Linear equations over multiplicative groups, recurrences, and mixing I 

Abstract. Let K be a field of positive characteristic. When V is a linear variety in $K^n$
and G is a finitely generated subgroup of K*, we show how to compute the set $V \cap G^n$
effectively using heights. We calculate all the estimates explicitly. A special case provides
the effective solution of the S-unit equation in n variables.

1. Introduction. In 2004 the second author published a paper [Mass] about linear
equations over multiplicative groups in positive characteristic. This was specifically aimed
at an application to a problem about mixing for dynamical systems of algebraic origin,
and, as a result about linear equations, it lacked some of the simplicity of the classical
results in zero characteristic. A new feature was the appearance of n - 1 independently
operating Frobenius maps; here n is the number of variables.

In 2007 the first author published a paper [D] about recurrences in positive
characteristic. He proved an analogue of the famous Skolem-Lech-Mahler Theorem in zero
characteristic. A new feature was the appearance of integer sequences involving
combinations of d - 2 powers of the characteristic; here d is the order of the recurrence.

It turns out that these two new features are identical. In positive characteristic the
vanishing of a recurrence with d terms can be regarded as a linear equation in d - 1
variables to be solved in a multiplicative group (so in particular n - 1 = d - 2). This
observation will be developed in three directions.

In the present paper we give an improved version of the result of [Mass] in a form 
more closely related to that in zero characteristic. In fact we shall prove some quantitative 
versions in which all the estimates are effective and furthermore we shall make them 
completely explicit. This is in sharp contrast to the situation in zero characteristic, where 
even in very simple circumstances there are no effective upper bounds for the solutions. 

In a second paper we shall apply these results to recover the main theorem of [D],
which we even generalize to sums of recurrences. In zero characteristic rather little is 
known about such sums, and indeed there is a conjecture of Cerlienco, Mignotte and Piras 
[CMP] to the effect that such problems are undecidable. In positive characteristic we will 
establish not only the decidability but also give completely effective algorithms to solve 
the problem. 

In a third paper we apply our linear equations results to give an algorithm to determine 
the smallest order of non-mixing of any basic action associated with a given prime ideal 
in a Laurent polynomial ring. From [Mass] we know that the non-mixing comes from the 
so-called non-mixing sets, and our work even provides a way of finding these. Again the
algorithms are completely effective. 

We begin by recalling the classical result for a linear equation in zero characteristic, 
for convenience in homogeneous form. For a field K we write K* for the multiplicative 
group of all non-zero elements of K. For any subgroup G of K* and a positive integer n 
it makes sense to write $\mathbf{P}_n(G)$ for the set of points in projective space defined over G.

Theorem A (Evertse [E], van der Poorten-Schlickewei [PS]). Let K be a field of zero
characteristic, and for $n \ge 2$ let $a_0, \ldots, a_n$ be non-zero elements of K. Then for any
finitely generated subgroup G of K* the equation

$$a_0 X_0 + a_1 X_1 + \cdots + a_n X_n = 0 \qquad (1.1)$$

has only finitely many solutions $(X_0, X_1, \ldots, X_n)$ in $\mathbf{P}_n(G)$ which satisfy

$$\sum_{i \in I} a_i X_i \neq 0 \qquad (1.2)$$

for every non-empty proper subset I of {0, 1, . . . , n}.

We should point out that this remains true even when G is not finitely generated 
but has finite Q-dimension. See also a recent paper [EZ] of Evertse and Zannier for an 
interesting function field version. 

Theorem A is false in positive characteristic p; for example in inhomogeneous form
for n = 2 the equation

$$x + y = 1 \qquad (1.3)$$

has a solution x = t, y = 1 - t over the group G in $K = \mathbf{F}_p(t)$ generated by t, 1 - t; and
so thanks to Frobenius infinitely many solutions

$$x = t^{p^e}, \quad y = 1 - t^{p^e} = (1-t)^{p^e} \quad (e = 0, 1, 2, \ldots) \qquad (1.4)$$

which all satisfy (1.2).
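
As an illustration that is not part of the original argument, the Frobenius solutions (1.4) can be checked by machine. The following Python sketch represents elements of F_p[t] as coefficient lists and confirms that t^{p^e} + (1-t)^{p^e} = 1 for small e. The helper names (poly_add, poly_mul, poly_pow, frobenius_solutions_hold) are our own and purely illustrative.

```python
# Polynomials over F_p are lists of coefficients, lowest degree first.

def poly_trim(a):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_add(a, b, p):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return poly_trim([(u + v) % p for u, v in zip(a, b)])

def poly_mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] = (out[i + j] + u * v) % p
    return poly_trim(out)

def poly_pow(a, e, p):
    """Square-and-multiply exponentiation in F_p[t]."""
    result = [1]
    base = poly_trim([c % p for c in a])
    while e > 0:
        if e & 1:
            result = poly_mul(result, base, p)
        base = poly_mul(base, base, p)
        e >>= 1
    return result

def frobenius_solutions_hold(p, max_e):
    """Check x + y = 1 for x = t^(p^e), y = (1 - t)^(p^e), e = 0..max_e."""
    t = [0, 1]
    one_minus_t = [1, (-1) % p]
    for e in range(max_e + 1):
        q = p ** e
        x = poly_pow(t, q, p)
        y = poly_pow(one_minus_t, q, p)
        if poly_add(x, y, p) != [1]:
            return False
    return True

print(frobenius_solutions_hold(5, 3))  # True
```

Replacing the exponent p^e by an integer that is not a power of p (say 2 when p = 5) makes the check fail, which is the sense in which these extra solutions are owed to Frobenius.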

We can regard Theorem A as a descent step from the hyperplane H defined by equation 
(1.1) to proper linear subvarieties defined by the vanishing of the left-hand sides in (1.2). 
We can iterate this descent by introducing special varieties T defined solely by binary
equations of the shape $X_i = aX_j$ ($i \neq j$, $a \neq 0$). For example T could be a single point or,
when there are no equations at all, the full $\mathbf{P}_n$. We could call such varieties linear cosets
or just cosets. This word has a group-theoretical connotation, and indeed T above is a
translate of a group subvariety of the multiplicative group $\mathbf{G}_m^n$ in $\mathbf{P}_n$. Conversely it is not
difficult to see that every linear translate of a group subvariety of $\mathbf{G}_m^n$ is a coset in our
sense (see for example Lemma 9.4 p. 76 of [BMZ]). But we will in this paper make no use
of these remarks or indeed hardly any further reference to group varieties.

Anyway, it is easily seen that the complete descent yields a finite collection of cosets
T, each contained in the original H, such that the full solution set $H(G) = H \cap \mathbf{P}_n(G)$
coincides with the union of all $T(G) = T \cap \mathbf{P}_n(G)$. This is a little closer to the more
general context of Mordell-Lang (see below). No further descent from T(G) in terms of
proper subvarieties is possible; by way of compensation it is very simple to describe T(G) 
explicitly (see for example the discussion towards the end of section 12). 

In positive characteristic we can establish a descent step similar to Theorem A, but 
it may involve Frobenius as in (1.4). This less simple situation makes the iteration more 
problematic, and for this reason it is clearer to present our result as a descent now from 
an arbitrary linear variety V to proper linear subvarieties. 

However the Frobenius does not always generate infinitely many solutions. It does
above for x + y = 1, and also for

$$t^m x + y = 1 \qquad (1.5)$$

by taking a new variable $t^m x$; this is because t lies in G. The situation is slightly more
subtle for (1.5) over the group $G_l$ generated by $t^l$ and 1 - t; the above solution of (1.3)
certainly leads to solutions

$$x = t^{p^e - m}, \quad y = (1-t)^{p^e} \quad (e = 0, 1, 2, \ldots), \qquad (1.6)$$

but these will not be over $G_l$ unless $p^e \equiv m \bmod l$. This can however happen for infinitely
many e but not necessarily all e in (1.6). This time t may not lie in $G_l$ but some positive
power does. Finally the equation (1 + t)x + y = 1 has a solution x = 1 - t, $y = t^2$ over G,
but the use of Frobenius will bring in an extra 1 + t, no positive power of which is in G
(provided $p \neq 2$).
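
The congruence condition on e in the discussion of (1.6) is easy to explore numerically. The sketch below is only an illustration: the function name admissible_exponents and the chosen parameter values are ours. It lists the exponents e up to a bound for which p^e is congruent to m modulo l.

```python
def admissible_exponents(p, m, l, max_e):
    """Return all e in [0, max_e] with p**e congruent to m modulo l."""
    return [e for e in range(max_e + 1) if pow(p, e, l) == m % l]

print(admissible_exponents(3, 1, 3, 10))  # [0]
print(admissible_exponents(2, 1, 7, 10))  # [0, 3, 6, 9]
print(admissible_exponents(2, 3, 7, 10))  # []
```

The three calls show the three possible behaviours: a single admissible exponent, an infinite arithmetic progression of them (truncated here at 10), and none at all.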

These considerations lead naturally to the radical $\sqrt{G} = \sqrt[K]{G}$ for general G in general
K*. For us this remains in K; thus it is the set of $\gamma$ in K for which there exists a positive
integer s such that $\gamma^s$ lies in G. Usually K will be finitely generated over its prime field,
and then it is well-known that the finite generation of G is equivalent to that of $\sqrt{G}$. We
also see the need for some concept of isotriviality, already present in diophantine geometry
at least since Néron's 1952 proof of the relative Mordell-Weil Theorem and Manin's 1963
proof of the relative Mordell Conjecture. In our linear context the appropriate refinement
is G-isotriviality, introduced by Voloch [V] for n = 2.

Namely, let K be a field of positive characteristic p, and for $n \ge 2$ let V be a linear
variety in $\mathbf{P}_n$ defined over K. We say that V is G-isotrivial if there is an automorphism $\psi$
of $\mathbf{P}_n(K)$, defined by

$$\psi(X_0, \ldots, X_n) = (g_0 X_0, \ldots, g_n X_n) \qquad (1.7)$$

with $g_0, \ldots, g_n$ in G, such that $\psi(V)$ is defined over the algebraic closure $\overline{\mathbf{F}_p}$. Such a $\psi$
could be called a G-automorphism. Let us write $\mathbf{F}_K$ for $\overline{\mathbf{F}_p} \cap K$; then of course $\psi(V)$ is
defined over $\mathbf{F}_K$. So $\psi(V)$ is defined over some $\mathbf{F}_q$; and now a point w on V defined over
G gives $\psi(w)$ on $\psi(V)$ which by Frobenius leads to points $\psi(w)^{q^e}$ (e = 0, 1, 2, . . .) on $\psi(V)$
and so

$$\psi^{-1}(\psi(w)^{q^e}) \quad (e = 0, 1, 2, \ldots) \qquad (1.8)$$

on V, all still defined over G.

Of course points over G are nothing other than zero-dimensional G-isotrivial varieties. 

Here is a preliminary version of our main descent step on linear equations. For V as
above write $V(G) = V \cap \mathbf{P}_n(G)$ for the set of points of V defined over G. But it is clearer
first to consider points over the radical $\sqrt{G}$.

Descent Step over $\sqrt{G}$. Let K be a field of positive characteristic, and suppose that the
positive-dimensional linear variety $V_0$ defined over K is not a coset. Suppose also that $\sqrt{G}$
in K is finitely generated. Then there is an effectively computable finite collection $\mathcal{W}$ of
proper $\sqrt{G}$-isotrivial linear subvarieties W of $V_0$, also defined over K, with the following
property.

(a) If $V_0$ is not $\sqrt{G}$-isotrivial, then

$$V_0(\sqrt{G}) = \bigcup_{W \in \mathcal{W}} W(\sqrt{G}).$$

(b) If $V_0$ is $\sqrt{G}$-isotrivial and $\psi(V_0)$ is defined over $\mathbf{F}_q$, then

$$V_0(\sqrt{G}) = \psi^{-1}\left( \bigcup_{W \in \mathcal{W}} \bigcup_{e=0}^{\infty} \bigl(\psi(W(\sqrt{G}))\bigr)^{q^e} \right).$$

Thus (a) says that the points of $V_0(\sqrt{G})$ are not Zariski-dense in $V_0$; and (b) says that
the points on $V_0(\sqrt{G})$ like (1.8), which can be dense, at least arise from a set of w which
is not dense.

Part (a) was essentially proved for n = 2 as Theorem 1 by Voloch [V] (p. 196), and his 
Theorem 2 (p. 198) even covers the more general case of finite Q-dimension; here one gets 
the finiteness of the solution set. A forerunner of part (b) for n = 2 can be seen in Mason 
[Maso] (pp. 107,108). The main result of [Mass] is restricted to a single equation (1.1) and 
is expressed in terms of a concept of "broad" set; as we do not need this result here (or 
even the concept) we refrain from quoting it. However these authors do not discuss the 
effectivity in our sense (see the discussion below).

A simple example of (b) in inhomogeneous form is (1.3); this represents a line L,
clearly isotrivial and even trivial in that we can take $\psi$ as the identity automorphism.
When G is generated by t and 1 - t in $K = \mathbf{F}_p(t)$, then $\sqrt{G}$ is obtained by adding the
elements of $\mathbf{F}_p^*$ as generators. Leitner [Le] has found that for p > 3 there are p + 4 points
W, six of which are like w = (t, 1 - t) in (1.4) and the remaining p - 2 are the w = (x, 1 - x)
for x = 2, 3, . . . , p - 1.

So much for $V_0(\sqrt{G})$. In the analogous characterization of $V_0(G)$ there is no longer
a clear separation of cases. In fact it can happen in case (b) above that the actions of
Frobenius through $q^e$ can get truncated, so that each e remains bounded; but then it is
easy to reduce this to case (a). A simple example is (1.5) for m = 1 in the group $G = G_l$
above for l = p, when the solutions (1.6) are over G only when e = 0. Here is a general
statement.

Descent Step over G. Let K be a field of positive characteristic, and suppose that the
positive-dimensional linear variety $V_0$ defined over K is not a coset. Suppose also that $\sqrt{G}$
in K is finitely generated. Then there is an effectively computable finite collection $\mathcal{W}$ of
proper $\sqrt{G}$-isotrivial linear subvarieties W of $V_0$, also defined over K, such that either

$$V_0(G) = \bigcup_{W \in \mathcal{W}} W(G)$$

or

$$V_0(G) = \psi^{-1}\left( \bigcup_{W \in \mathcal{W}} \bigcup_{e=0}^{\infty} \bigl(\psi(W)(G)\bigr)^{q^e} \right)$$

for some q and some $\sqrt{G}$-automorphism $\psi$ with $\psi(V_0)$ defined over $\mathbf{F}_q$.

It may be instructive here to consider the inhomogeneous example

$$x + y - z = 1 \qquad (1.9)$$

still over the group G in $K = \mathbf{F}_p(t)$ generated by t, 1 - t. Now (1.9) represents a plane P,
also isotrivial and even trivial. Leitner [Le] has found that for p > 5 there are 22 lines W
and 8 points W. For example the line defined by

$$tx + y = 1, \quad z = (1-t)x \qquad (1.10)$$

is one of these. So is the coset line defined by x = z, y = 1. And so is the point

$$x = t, \quad y = \frac{1-t}{t}, \quad z = \frac{(1-t)^2}{t}.$$

We can easily iterate the descent from (1.10). This is isotrivial via the automorphism
$\psi$ taking x, y, z to $\tilde{x} = tx$, $\tilde{y} = y$, $\tilde{z} = \frac{t}{1-t}z$, when the equations become $\tilde{x} + \tilde{y} = 1$, $\tilde{z} = \tilde{x}$.
Now (1.4) (with e replaced by f) on (1.3) lead to the points w = (x, y, z) of W(G) with

$$x = t^{p^f - 1}, \quad y = (1-t)^{p^f}, \quad z = t^{p^f - 1}(1-t) \quad (f = 0, 1, 2, \ldots).$$

Then from (1.8) (with q = p and the identity automorphism) we get the points

$$x = t^{(q-1)r}, \quad y = (1-t)^{qr}, \quad z = t^{(q-1)r}(1-t)^r \qquad (1.11)$$

of P(G); here $q = p^f$ and $r = p^e$ now indicate independently varying powers of p. This is
precisely the example in [Mass] (p. 202).

With the help of a suitable notation we can after all do the complete descent, also for
linear varieties that are cosets; then the latter arise solely as obstacles. Denote by $\varphi = \varphi_q$
the Frobenius with $\varphi(x) = x^q$. Let $\psi_1, \ldots, \psi_h$ be projective automorphisms. Then we
imitate commutator brackets by defining the operator

$$[\psi_1, \ldots, \psi_h] = [\psi_1, \ldots, \psi_h]_q = \bigcup_{e_1=0}^{\infty} \cdots \bigcup_{e_h=0}^{\infty} (\psi_1^{-1}\varphi^{e_1}\psi_1) \cdots (\psi_h^{-1}\varphi^{e_h}\psi_h), \qquad (1.12)$$

with of course the identity interpretation if h = 0. This formally resembles Definition 7.7
of [D] (p. 208).

Theorem 1. Let K be a field of positive characteristic p, let V be an arbitrary linear
variety defined over K, and suppose that $\sqrt{G}$ in K is finitely generated. Then there is a power
q of p such that V(G) is an effectively computable finite union of sets $[\psi_1, \ldots, \psi_h]_q T(G)$
with $\sqrt{G}$-automorphisms $\psi_1, \ldots, \psi_h$ ($0 \le h \le n - 1$), and cosets T contained in V.

Here we see quite clearly the n - 1 Frobenius operators mentioned in the first
paragraph of section 1. In general they act independently because they are separated by
automorphisms. The example

$$x_1 + x_2 - x_3 - \cdots - x_n = 1$$

generalizes (1.3) and (1.9), and it can be used to show that the upper bound n - 1 in
Theorem 1 cannot always be improved. This we carry out in section 13 on limitation
results. The same can also be seen indirectly through the applications to recurrences,
where we will see that the analogous upper bound d - 2 cannot always be improved.

Taking $e_1 = 1$ in (1.12) and all others zero, we see that $\psi_1^{q-1}$ is a G-automorphism.
Similarly for $\psi_2^{q-1}, \ldots, \psi_h^{q-1}$. However it may not always be possible to choose $\psi_1, \ldots, \psi_h$
as G-automorphisms. This we also prove in section 13.

We can also symmetrize the sets in Theorem 1. We explain this with the points (1.11)
on P defined by (1.9). They can be written as

$$x = t^{s-r}, \quad y = (1-t)^s, \quad z = t^{s-r}(1-t)^r \qquad (1.13)$$

with s = qr. Here there is asymmetry because apparently r divides s. However (1.13) has
a meaning for any independent positive powers r, s of p; and it is easily checked that the
resulting points remain on P.
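
The claim that the points (1.13) remain on P for independent powers r, s of p can be tested directly. The sketch below is an illustration under our own conventions: to avoid negative exponents when s < r, it multiplies x + y - z = 1 through by t^r and checks the resulting polynomial identity t^s + t^r (1-t)^s - t^s (1-t)^r = t^r in F_p[t]. All function names are hypothetical helpers.

```python
# Elements of F_p[t] are coefficient lists, lowest degree first.

def trim(a):
    """Remove trailing zeros so that equal polynomials have equal lists."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b, p):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([(u + v) % p for u, v in zip(a, b)])

def sub(a, b, p):
    return add(a, [(-c) % p for c in b], p)

def mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] = (out[i + j] + u * v) % p
    return trim(out)

def power(a, e, p):
    """Square-and-multiply exponentiation in F_p[t]."""
    result, base = [1], trim([c % p for c in a])
    while e:
        if e & 1:
            result = mul(result, base, p)
        base = mul(base, base, p)
        e >>= 1
    return result

def on_plane(p, r, s):
    """Check t^r * (x + y - z) == t^r for the point (1.13) with parameters r, s."""
    t, u = [0, 1], [1, (-1) % p]          # u stands for 1 - t
    ts, tr = power(t, s, p), power(t, r, p)
    lhs = add(ts, mul(tr, power(u, s, p), p), p)
    lhs = sub(lhs, mul(ts, power(u, r, p), p), p)
    return lhs == tr

powers = [3 ** k for k in range(3)]       # 1, 3, 9
print(all(on_plane(3, r, s) for r in powers for s in powers))  # True
```

Taking r = 2 with p = 3, which is not a power of p, makes on_plane return False, so the identity really does depend on r and s being powers of the characteristic.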

To formulate this in general we introduce another bracket notation more related to
the group law. For points $\pi_0, \pi_1, \ldots, \pi_h$ we define the set

$$(\pi_0, \pi_1, \ldots, \pi_h) = (\pi_0, \pi_1, \ldots, \pi_h)_q = \pi_0 \bigcup_{l_1=0}^{\infty} \cdots \bigcup_{l_h=0}^{\infty} (\varphi^{l_1}\pi_1) \cdots (\varphi^{l_h}\pi_h), \qquad (1.14)$$

with of course the interpretation $\pi_0$ itself if h = 0. We introduce more special varieties
S defined solely by binary equations of the shape $X_i = X_j$. For example S could be the
single point with all coordinates equal or the full $\mathbf{P}_n$. We could call such varieties linear
subgroups or just subgroups. As above it is not difficult to see that they are precisely the
linear group subvarieties of $\mathbf{G}_m^n$, but again we don't need to know this.

Theorem 2. Let K be a field of positive characteristic p, let V be an arbitrary linear
variety defined over K, and suppose that $\sqrt{G}$ in K is finitely generated. Then there is a power
q of p such that V(G) is an effectively computable finite union of sets $(\pi_0, \pi_1, \ldots, \pi_h)_q S(G)$
with points $\pi_0, \pi_1, \ldots, \pi_h$ ($0 \le h \le n - 1$) defined over $\sqrt{G}$ and subgroups S.

As in Theorem 1, the upper bound n - 1 in Theorem 2 cannot always be improved.
We shall verify this in section 13. Also one can easily see that $\pi_0^{q-1}, \pi_1^{q-1}, \ldots, \pi_h^{q-1}$ (as
well as the product $\pi_0 \pi_1 \cdots \pi_h$) are defined over G. However this may not always be true
of $\pi_0, \pi_1, \ldots, \pi_h$, as we shall also prove in section 13.

When V is a hyperplane defined by (1.1) we can even descend to points, provided we 
restrict to (1.2) in the style of Theorem A. 

Theorem 3. Let K be a field of positive characteristic p, let H be defined by

$$a_0 X_0 + a_1 X_1 + \cdots + a_n X_n = 0$$

for non-zero $a_0, a_1, \ldots, a_n$ in K, and write H*(G) for the set of points in $\mathbf{P}_n(G)$ satisfying

$$\sum_{i \in I} a_i X_i \neq 0$$

for every non-empty proper subset I of {0, 1, . . . , n}. Suppose that $\sqrt{G}$ in K is finitely
generated. Then there is a power q of p such that H*(G) is contained both in (1) an
effectively computable finite union of sets $[\psi_1, \ldots, \psi_h]_q\{\tau\}$ in H(G) with $\sqrt{G}$-automorphisms
$\psi_1, \ldots, \psi_h$ ($0 \le h \le n - 1$) and points $\tau$, and in (2) an effectively computable finite union
of sets $(\pi_0, \pi_1, \ldots, \pi_h)_q$ in H(G) with points $\pi_0, \pi_1, \ldots, \pi_h$ ($0 \le h \le n - 1$).

We do not prove it here, but in this situation H*(G) is precisely a finite union of
$[\psi_1, \ldots, \psi_h]_q\{\tau\}$. However there seems to be a strange asymmetry between the asymmetric
part (1) and the symmetric part (2). Namely it seems improbable that H*(G) is precisely
a finite union of $(\pi_0, \pi_1, \ldots, \pi_h)_q$. For example, the point (1.13) on H defined by (1.9) is
in H*(G) except for r = s, which disturbs the independence of r and s.

Apart from the work [V] already mentioned, there are other results of this kind, now
in the more general context of Mordell-Lang for arbitrary varieties V inside arbitrary
semiabelian varieties S. Typically here one intersects V with a finitely generated subgroup
$\Gamma$ of S; however in the present paper with $S = \mathbf{G}_m^n$ we have for simplicity restricted $\Gamma$ to
a Cartesian product $G^n$.

Thus the main result Theorem A (p. 104 - see also p. 109) of Abramovich and Voloch
[AV] almost implies part (a) of our Descent Step over $\sqrt{G}$, except that they assume that
V is not K*-isotrivial and they have no information about W which would ensure linearity
in our situation. The main result Theorem 1.1 (p. 667) of Hrushovski's well-known paper
[Hr] gives a similar implication. The restriction to our (a) corresponds to their restriction
to the non-isotrivial case. Again these authors do not discuss the effectivity in our sense.

After earlier work by Scanlon, the isotrivial case was treated by Moosa and Scanlon.
Their Theorem B (p. 477) of [MS2] implies that our V(G) is what they call an F-set (see
also [MS1]). Indeed in our situation and notation an F-set is nothing other than a finite
union of $(\pi_0, \pi_1, \ldots, \pi_h)_q A(G)$ with $\pi_0 \pi_1 \cdots \pi_h$ and $\pi_0^{q-1}, \pi_1^{q-1}, \ldots, \pi_h^{q-1}$ defined over G
and an algebraic subgroup A. However they do not prove the bound $h \le n - 1$ and they
do not give an estimate for A which would imply that it is linear because our V is. Their
ideas were developed by Ghioca [Gh], who in addition extended the results to Drinfeld
modules. See also the work [GM] of Ghioca and Moosa on division groups. Again there is
no mention of effectivity.

Now let us discuss this effectivity, a key aspect of the present paper. 

It is well-known that Theorem A (in zero characteristic) is semieffective in the sense 
that effective and even explicit upper bounds for the number of solutions of (1.1) subject to 
(1.2) can be found. However it is not fully effective in the sense that no upper bounds are 
known for the size of the solutions, even in very simple cases like K = Q and G generated
by 3, 5, 7; and it is even unknown how to find all the finitely many non-negative integers
a, b, c satisfying an equation like

$$3^a + 5^b - 7^c = 1.$$

Out of the works in positive characteristic quoted above, only two discuss effectivity, 
and then only semieffectivity in the sense above. Voloch [V] in the theorems mentioned 
above gives explicit upper bounds for the cardinality of V(G) for n = 2 in case (a) of 
Theorem 1; these are uniform in the sense that they are independent of V and further 
they depend on G only with regard to its rank. A similarly uniform bound is given as
Theorem 6.1 (p. 687) by Hrushovski [Hr] for V in an abelian variety; however as it stands
it is not completely explicit due to the use of non-standard analysis. These bounds are in
line with the well-known estimates in zero characteristic - see for example Theorem 1.1 of
[ESS] (p. 808).

By contrast our results above are fully effective. This should be no surprise; for
example it is rather easy by differentiating to find all non-negative integers a, b, c with

$$(3 + t)^a + (5 + t)^b - (7 + t)^c = 1$$

in any fixed $K = \mathbf{F}_p(t)$. We shall work out explicit bounds, at first for the Descent Step
over $\sqrt{G}$, where the exponents appearing can be reasonably small; and then for the Descent
Step over G and Theorems 1, 2 and 3. It would then be a straightforward matter to deduce
bounds for the various cardinalities involved; but more work may be needed to make these
uniform in the sense above.

In fact the size bounds cannot be uniform in this sense. For example from the
non-isotrivial equation x + ay = 1 with $a = \frac{1 - t^m}{(1-t)^m}$ ($m \neq p^e$) over the group generated by t
and 1 - t in $\mathbf{F}_p(t)$, with solution $x = t^m$, $y = (1-t)^m$, we can easily show that the size of
solutions for fixed G must depend on V. Similarly the isotrivial equation x + y = 1 over
the group generated by $t^m$ and $(1-t)^m$ in $\mathbf{F}_p(t)$, with the same solution, demonstrates
that the size of solutions for fixed V must depend on more than just the rank of G.

Because all our varieties are linear, we can measure them in a traditional way in terms
of certain heights on the Grassmannian. We will show for example in the Descent Step
over $\sqrt{G}$ that

$$h(W) \le C\,h(V_0)^{2n} \qquad (1.15)$$

if W is no longer required to be $\sqrt{G}$-isotrivial, where C depends only on K, n and G. If
we insist on W being $\sqrt{G}$-isotrivial, then the exponent is not so small. The well-known
Northcott Property of heights often implies that the set of W in (1.15) is finite and easily
effectively computable.

Perhaps since the results in zero characteristic are not effective, there is no tradition
about measuring the groups $\Gamma$, even in $S = \mathbf{G}_m^n$. Because our $\Gamma = G^n$, it is here possible
to use a basis-free notion of regulator R(G). We will show that the bounds, at least when
$G = \sqrt{G}$, are all of polynomial growth in R(G). For example in (1.15) we get

C < cR(Gf n+2 

again if W is no longer required to be $\sqrt{G}$-isotrivial, where c now depends only on K, n
and the rank r of G. In fact here

c = 8n 2 d(10?2 3 (n + 3(n+r) ) 2n+1 






with d depending only mildly on K; for example d = 1 if K is a field of rational functions 
in several independent variables over a finite field. 

However we did find it a small surprise to discover that when $G \neq \sqrt{G}$ the smallest
bounds can be exponential in R(G). A hint of this can be seen from the above discussion
of (1.5) and $G_l$. For example the simplest solution of the equation

$$t^{42} x + y = 1$$

with x, y in the group generated by $t^{83}$ and 1 - t in $\mathbf{F}_2(t)$ is

$$x = (t^{83})^{29130742641316365655570}, \quad y = (1-t)^{2417851639229258349412352}, \qquad (1.16)$$

while the regulator is only 83^- For an explanation see the end of section 11. 
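
The large exponents in this example can be reproduced from the congruence condition discussed after (1.6), here with p = 2, m = 42 and l = 83. The following sketch is an illustration with helper names of our own choosing: it finds the smallest e with 2^e congruent to 42 modulo 83 and then forms the exponents of 1 - t and of t^83.

```python
from math import gcd

def multiplicative_order(a, n):
    """Smallest k >= 1 with a**k congruent to 1 modulo n (needs gcd(a, n) == 1)."""
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("a must be invertible modulo n >= 2")
    k, v = 1, a % n
    while v != 1:
        v = v * a % n
        k += 1
    return k

def smallest_exponent(p, m, l):
    """Smallest e >= 0 with p**e congruent to m modulo l, or None if none exists."""
    v = 1 % l
    for e in range(multiplicative_order(p, l)):
        if v == m % l:
            return e
        v = v * p % l
    return None

e = smallest_exponent(2, 42, 83)
y_exponent = 2 ** e                 # y = (1 - t)^(2^e)
x_exponent = (2 ** e - 42) // 83    # x = (t^83)^x_exponent = t^(2^e - 42)

print(e)            # 81
print(y_exponent)   # 2417851639229258349412352
print(x_exponent)   # 29130742641316365655570
```

Since 2 has order 82 modulo 83 and 42 is the inverse of 2 there, the first admissible exponent is e = 81, which accounts for the size of the numbers.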

In section 12 we estimate the heights (in a natural sense) of all the quantities occurring 
in our Theorems. The bounds are polynomial in h(V) and R(G) if $G = \sqrt{G}$; but otherwise
they may involve an extra, possibly unavoidable, exponential dependence on R(G). Here 
too there is a Northcott Property to ensure effectivity. 

At first sight it may seem that the methods of [Mass] and [D] are unrelated. But there
are close connections, and we give some hints of this in our exposition. Here we mention
just that [Mass] works with derivatives and [D] works with p-automata and "free Frobenius
splitting". For example over $\mathbf{F}_p(t)$, [Mass] (p. 196) has $\delta_i = (\frac{d}{dt})^i$ (i = 0, . . . , p - 1) while
[D] (p. 198) splits $\mathbf{F}_p(t)$ into a direct sum of one-dimensional $\mathbf{F}_p(t^p)$-subspaces $V_i$ (i =
0, . . . , p - 1) and considers the associated projections $\Lambda_i$. In the natural case $V_i = t^i \mathbf{F}_p(t^p)$
one checks easily that the vectors $(\delta_0, t\delta_1, \ldots, t^{p-1}\delta_{p-1})$ and $(\Lambda_0, \Lambda_1, \ldots, \Lambda_{p-1})$ are connected
via an invertible matrix over $\mathbf{F}_p$. So in some sense differentiating is equivalent to projecting.
We can also quote Hrushovski [Hr] (p. 669) "Distinguishing a basis for $K/K^p$ has the
effect of fixing also a stack of Hasse derivations." As a matter of fact we do not use Hasse
derivations in this paper (see the remarks at the end of section 5).
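
The link between differentiating and projecting can be made concrete in a small case. The following sketch rests on a computation of our own, not taken from [Mass] or [D]: on the summand t^j F_p(t^p) the operator t^i (d/dt)^i acts as multiplication by the falling factorial j(j-1)...(j-i+1), so the connecting matrix should have these falling factorials as entries. The code builds that matrix modulo p and checks that its determinant is non-zero. All names are hypothetical.

```python
def falling_factorial(j, i, p):
    """j (j - 1) ... (j - i + 1) reduced mod p (equal to 1 when i == 0)."""
    out = 1
    for k in range(i):
        out = out * (j - k) % p
    return out

def transition_matrix(p):
    """Row i, column j: coefficient of the j-th projection in t^i (d/dt)^i."""
    return [[falling_factorial(j, i, p) for j in range(p)] for i in range(p)]

def det_mod_p(matrix, p):
    """Determinant over F_p by Gaussian elimination (p prime)."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = (-det) % p
        det = det * m[col][col] % p
        inv = pow(m[col][col], p - 2, p)   # Fermat inverse, valid for prime p
        for r in range(col + 1, n):
            f = m[r][col] * inv % p
            for c in range(col, n):
                m[r][c] = (m[r][c] - f * m[col][c]) % p
    return det

p = 5
print(det_mod_p(transition_matrix(p), p) != 0)  # True
```

The matrix is upper triangular with diagonal entries i! for i < p, none of which is divisible by p, which is why the determinant cannot vanish.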

Here is a brief section-by-section account of what follows. 

We begin in section 2 by explaining heights. Then in section 3 we introduce
derivations, and we use all this to give preliminary effective versions of the two main technical
results of [Mass] about dependence over the field of differential constants.

In section 4 we explain regulators, and in section 5 we use these to refine the work of 
section 3. 

Then section 6 contains a technical result which enables us to identify isotriviality,
and in section 7 we record some observations about automorphisms and heights of varieties 
V. 

We are now in a position in section 8 to make effective the main argument of [Mass] 
yielding the subvarieties W, at least for points over $\sqrt{G}$ and when V is either a hyperplane
or trivial. We treat general V in section 9 but omitting the isotriviality of the W. This 
omission is then remedied in section 10 with a simple inductive argument, and in section 
11 we show how to treat points over G. We can then in section 12 prove effective versions 
of our Descent Steps and Theorems. 

Finally in section 13, as already mentioned, we show that various aspects of our results 
cannot be further improved. 

We would also like to draw attention to a very recent manuscript [AB] of Adamczewski 
and Bell for further work in the context of p-automata; in particular this covers also 
equations (1.1) and recurrences. 

2. Heights. The Theorems above for arbitrary fields can easily be reduced to the case
when the field is finitely generated over its ground field $\mathbf{F}_p$ (see section 12 below). In
general let K be finitely generated over a subfield k in any characteristic. We shall define
heights on K relative to k; thus we suppose that K is a transcendental extension of k.
Here we do not know any basis-free notion of height, and thus we choose a transcendence
basis B of K over k with elements $t_1, \ldots, t_b$ regarded as independent variables over k. The
height $h(a) = h_B(a)$ of an element $a \neq 0$ of $k[B] = k[t_1, \ldots, t_b]$ will be its total degree deg a
regarded as a polynomial; also h(0) = 0. The height can be extended to an element x of
the quotient field $k(B) = k(t_1, \ldots, t_b)$ by writing $x = a_0/a_1$ for coprime polynomials $a_0, a_1$ in
k[B] and defining

$$h(x) = h_B(x) = \max\{\deg a_0, \deg a_1\}. \qquad (2.1)$$

That suffices for most examples, but for mixing problems we have to extend further to all
of K. This is a standard matter using valuations.
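
For a single variable (b = 1) and k = F_p the height (2.1) is straightforward to compute. The sketch below is an illustration under these simplifying assumptions, with function names of our own: it cancels the greatest common divisor of numerator and denominator in F_p[t] and returns the larger of the two remaining degrees.

```python
# Elements of F_p[t] are coefficient lists, lowest degree first; p is prime.

def trim(a, p):
    """Reduce coefficients mod p and drop trailing zeros."""
    a = [c % p for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_mod(a, b, p):
    """Remainder of a on division by the non-zero polynomial b in F_p[t]."""
    r, b = trim(a, p), trim(b, p)
    inv = pow(b[-1], p - 2, p)            # inverse of the leading coefficient
    while len(r) >= len(b):
        shift = len(r) - len(b)
        coef = r[-1] * inv % p
        for i, c in enumerate(b):
            r[shift + i] = (r[shift + i] - coef * c) % p
        r = trim(r, p)
    return r

def poly_gcd(a, b, p):
    """A greatest common divisor in F_p[t] (not normalized to be monic)."""
    a, b = trim(a, p), trim(b, p)
    while b:
        a, b = b, poly_mod(a, b, p)
    return a

def height(num, den, p):
    """max(deg a0, deg a1) for num/den written in lowest terms a0/a1."""
    num, den = trim(num, p), trim(den, p)
    if not den:
        raise ZeroDivisionError("denominator is zero in F_p[t]")
    if not num:
        return 0
    g = poly_gcd(num, den, p)
    # deg(num/g) and deg(den/g) both drop by deg g, so subtract it once.
    return max(len(num), len(den)) - len(g)

print(height([-1, 0, 1], [-1, 1], 5))   # (t^2 - 1)/(t - 1) = t + 1, height 1
print(height([0, 1], [1, 1], 5))        # t/(1 + t), height 1
```

The cancellation step matters: without it the first example would be assigned height 2 rather than 1.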

There is a valuation on k[B] corresponding to total degree and defined by
$|a|_\infty = \exp(\deg a)$ ($a \neq 0$); and of course $|0|_\infty = 0$. This extends at once to k(B) by
multiplicativity. And for every irreducible p in k[B] there is a valuation defined on k[B] by
$|a|_p = \exp(-\omega_p(a) \deg p)$ ($a \neq 0$), where $\omega_p(a)$ is the exact power of p dividing a; and
again $|0|_p = 0$. And it too extends to k(B) by multiplicativity. Using v to run over $\infty$
and all the p, we have the product formula $\prod_v |x|_v = 1$ ($x \neq 0$) and the height formula
$h(x) = \log \prod_v \max\{1, |x|_v\}$.

Now K is a finite extension of k(B), say of degree d. Thus each valuation v has finitely
many extensions w to K, written w|v. In fact

$$|x|_w = |N(x)|_v^{1/d_w}, \qquad (2.2)$$

where the norm is from the completion $K_w$ to the completion $k(B)_v$ and $d_w$ is the relative
degree. We also have $\sum_{w|v} d_w = d$. Now the product formula

$$\prod_w |x|_w^{d_w} = 1$$

holds. Further the formula

$$h(x) = \frac{1}{d} \log \prod_w \max\{1, |x|_w^{d_w}\}$$

extends the height $h = h_B$ to an absolute height on K. For all this see [La2] (pp. 1-19) or
[BG] (pp. 1-10).

Actually for convenience in estimating we will use from now on the relative height 

h(x) = hs(x) = dh(x) > 1. 

This can be calculated directly from the minimum polynomial in the following extension 
of (2.1). 

Lemma 2.1. Suppose x in K satisfies an equation A(x) = 0 with $A(t) = a_0 t^e + \cdots + a_e$ for
$a_0, \ldots, a_e$ in k[B] and A(t) irreducible over k[B]. Then $e\,h(x) = d \max\{\deg a_0, \ldots, \deg a_e\}$.
Proof. Over a splitting field L we have $A(t) = a_0(t - x_1) \cdots (t - x_e)$, and we can extend,
keeping the same notation, all the valuations to L. Then Gauss's Lemma gives

$$\max\{|a_0|_w, \ldots, |a_e|_w\} = |a_0|_w \max\{1, |x_1|_w\} \cdots \max\{1, |x_e|_w\}.$$

If w does not divide $\infty$ then the left-hand side is 1 because $a_0, \ldots, a_e$ are coprime; and
otherwise they are all $\max\{|a_0|_\infty, \ldots, |a_e|_\infty\}$. Taking the product with exponents $d_w$ and
then taking logarithms gives on the left-hand side $d \max\{\deg a_0, \ldots, \deg a_e\}$ and on the
right-hand side $h(x_1) + \cdots + h(x_e)$. This last is just e h(x) because $x_1, \ldots, x_e$ are conjugate
over k(B).

An immediate consequence of Lemma 2.1 is the Northcott Property; namely that for
any H there are at most finitely many x in K with $h(x) \le H$.

We will also need the standard extensions to vectors. So for $x_1, \ldots, x_l$ in K we define

$$h(x_1, \ldots, x_l) = \log \prod_w \max\{1, |x_1|_w^{d_w}, \ldots, |x_l|_w^{d_w}\}.$$

For example $h(a_0, \ldots, a_e)$ in the situation of Lemma 2.1 is just $d \max\{\deg a_0, \ldots, \deg a_e\}$.
The Northcott Property extends at once to $K^l$.



3. Dependence with heights. Given K finitely generated and transcendental over
k, there is always a separable transcendence basis $B = (t_1, \ldots, t_b)$; this means that K
is separable over k(B). As above write d = [K : k(B)]. On k[B] we have the standard
derivations $\partial/\partial t_1, \ldots, \partial/\partial t_b$, which extend in the obvious way to k(B). And by separability
they extend uniquely to K. For all this see [La1] (pp. 183-184). For an integer $i \ge 0$ we
define $\mathcal{D}(i)$ as the set of operators

$$\left(\frac{\partial}{\partial t_1}\right)^{i_1} \cdots \left(\frac{\partial}{\partial t_b}\right)^{i_b}$$

as $i_1, \ldots, i_b$ run over all non-negative integers with $i_1 + \cdots + i_b \le i$. This is not quite the
same as [Mass] (p. 196), where we had $i \ge 1$ and $i_1 + \cdots + i_b < i$.
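
To make the size of this set of operators concrete, the following sketch enumerates the exponent vectors that index it, namely all non-negative integer tuples of length b with sum at most i. It is only an illustration; the function name operator_exponents is ours.

```python
from itertools import product
from math import comb

def operator_exponents(b, i):
    """All (i_1, ..., i_b) of non-negative integers with i_1 + ... + i_b <= i."""
    return [idx for idx in product(range(i + 1), repeat=b) if sum(idx) <= i]

ops = operator_exponents(2, 2)
print(ops)       # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)]
print(len(ops))  # 6
```

By a standard stars-and-bars count there are comb(i + b, b) such tuples, so the number of operators grows polynomially in i for fixed b.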

It will be convenient for later calculations to define a quantity h(x; i) as follows. We
order in some way the operators $D_1, \ldots, D_l$ of $\mathcal{D}(i)$, and we define for $x \neq 0$

$$h(x; i) = h_B(x; i) = h\left(\frac{D_1 x}{x}, \ldots, \frac{D_l x}{x}\right),$$

of course independent of the ordering.

The next result is an explicit version of Lemma 3 of [Mass] (p. 195) however without 
reference to any group G. We write C for the field of differential constants in K. For 
zero characteristic this is k, but for positive characteristic p it is the set of pth powers of 
elements of K. 

Lemma 3.1. For $m \ge 2$ suppose $c_1, \ldots, c_m$ are in C and $x_1, \ldots, x_m$ are in K* with

$$c_1 x_1 + \cdots + c_m x_m = 1. \qquad (3.1)$$

Then either

(a) $h(c_1 x_1, \ldots, c_m x_m) \le (m + 1)\bigl(h(x_1; m - 1) + \cdots + h(x_m; m - 1)\bigr)$

or

(b) $x_1, \ldots, x_m$ are linearly dependent over C.

Proof. If (b) does not hold, then the theory of the generalized Wronskian (see for example
[La2] p. 174) shows that we may find operators $D_i$ in $\mathcal{D}(i)$ (i = 0, . . . , m - 1) such that the
matrix with entries $D_i x_j$ (i = 0, . . . , m - 1; j = 1, . . . , m) is non-singular. Applying them
to (3.1) we get

$$\sum_{j=1}^{m} c_j D_i(x_j) = D_i(1) \quad (i = 0, \ldots, m - 1).$$

These can be solved by Cramer's Rule to get $c_j x_j = w_j/w_0$ (j = 1, . . . , m), where $w_0 \neq 0$ is
the determinant of the matrix with entries $D_i x_j / x_j$ (i = 0, . . . , m - 1; j = 1, . . . , m). Noting
that this determinant is multilinear in the columns, we find that

$$h(w_0) \le h(x_1; m - 1) + \cdots + h(x_m; m - 1).$$

The same bound holds for the $h(w_j)$ (j = 1, . . . , m). We conclude that
$h(c_1 x_1, \ldots, c_m x_m) = h(w_1/w_0, \ldots, w_m/w_0)$ is at most

$$h(w_0) + h(w_1) + \cdots + h(w_m) \le (m + 1)\bigl(h(x_1; m - 1) + \cdots + h(x_m; m - 1)\bigr)$$

as required.

We deduce an explicit version of Lemma 4 of [Mass] (p. 197), also without G. 

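Lemma 3.2. For $m \geq 2$ suppose $x_0, x_1, \ldots, x_m$ are in $K^*$ and linearly dependent over $C$
but $x_1, \ldots, x_m$ are linearly independent over $C$. Then there is a relation

$$c_1x_1 + \cdots + c_mx_m = x_0 \tag{3.2}$$

with $c_1, \ldots, c_m$ in $C$ and

$$h\left(\frac{c_1x_1}{x_0}, \ldots, \frac{c_mx_m}{x_0}\right) \leq (m+1)\left(h\left(\frac{x_1}{x_0}; m-1\right) + \cdots + h\left(\frac{x_m}{x_0}; m-1\right)\right).$$

Proof. There is certainly a relation (3.2) with $c_1, \ldots, c_m$ in $C$, and we apply Lemma 3.1
to the quotients $x_1/x_0, \ldots, x_m/x_0$. As these are linearly independent over $C$, the conclusion
(b) cannot hold. Now conclusion (a) is just what we need, and this completes the proof.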

In section 5 we shall prove versions of Lemmas 3.1 and 3.2 that are uniform for
$x_0, x_1, \ldots, x_m$ in a finitely generated group $G$ as in [Mass]. By way of preparation, the
next result illustrates the logarithmic nature of the quantities $h(\,\cdot\,; i)$.

Lemma 3.3. For any $x \neq 0$, $y \neq 0$ in $K$ and any integers $i \geq 0$, $e \geq 0$ we have
$h(xy; i) \leq h(x; i) + h(y; i)$ and $h(x^e; i) \leq ih(x; i)$.

Proof. Let $D$ be in $\mathcal{D}(i)$. By distributing operators over the factors of $xy$ as in Leibniz, we
see that $D(xy)/(xy)$ is a sum with generalized binomial coefficients of products $\frac{E(x)}{x}\frac{F(y)}{y}$ with
operators $E, F$ also in $\mathcal{D}(i)$. Taking $D = D_1, \ldots, D_l$ as in the definition of $h(xy; i)$, we
deduce the first inequality of the present lemma by standard height calculations.

When $e$ is a positive integer, a similar argument shows that $D(x^e)/x^e$ is a sum with
generalized binomial coefficients of products $\frac{E_1(x)}{x} \cdots \frac{E_e(x)}{x}$ with operators $E_1, \ldots, E_e$ also
in $\mathcal{D}(i)$. Here $E_1 \cdots E_e = D$, so that there are at most $i$ terms not equal to 1 in this
product. Thus $D(x^e)/x^e$ is a polynomial of total degree at most $i$ in the $E(x)/x$ for $E$ in $\mathcal{D}(i)$.
The second inequality now follows in a similar way, at least for $e \geq 1$. The result is trivial
for $e = 0$.

Lemma 3.4. For any $x \neq 0$ in $K$ and any integer $i \geq 0$ we have $h(x; i) \leq 4idh(x)$.

Proof. This is trivial for $i = 0$, so we assume from now on $i \geq 1$. We have an equation
$A(x) = 0$ as in Lemma 2.1, of degree $e \leq d$. Denote by $A'(t)$ the derivative with respect
to $t$. Pick any $D$ in $\mathcal{D}(i)$. We claim that $B_i = (A'(x))^{2i-1}Dx$ is a polynomial in $x$ and
various derivatives $\tilde{D}a$ of various coefficients $a$ of $A$, with coefficients in $k$ and of degree
at most $(2i-1)(e-1)+1$ in $x$ and of total degree at most $2i-1$ in the $\tilde{D}a$. We prove
this by induction on $i$.

When $i = 1$ we have for example $D = \partial/\partial t_1 = \partial$ (say), and applying this to $A(x) = 0$
yields $B_1 = -\sum_{j=0}^{e}(\partial a_{e-j})x^j$, for which the claim is clear.

Assuming $Dx = B_i/(A'(x))^{2i-1}$ with $B_i$ as above, we do the induction step by applying
one more operator, again say $\partial/\partial t_1 = \partial$. We get

$$(A'(x))^{2i}\,\partial Dx = A'(x)\,\partial B_i - (2i-1)B_i\,\partial(A'(x)).$$

Here $\partial B_i$ involves $x$ to degree at most $(2i-1)(e-1)+1$ and also $x$ to degree at most
$(2i-1)(e-1)$ multiplied by $\partial x = B_1/A'(x)$, together with $\tilde{D}a$ to total degree at most $2i-1$.
Similarly $\partial(A'(x))$ involves $x$ to degree at most $e-1$ and also $x$ to degree at most $e-2$ (if
$e \neq 1$) multiplied by $\partial x = B_1/A'(x)$, together with $\tilde{D}a$ to total degree at most 1. Multiplying
by $A'(x)$ we get $(A'(x))^{2i+1}\,\partial Dx$ involving $x$ to degree at most

$$e-1+\max\{(2i-1)(e-1)+1+(e-1),\ (2i-1)(e-1)+e\} = (2(i+1)-1)(e-1)+1,$$

and the degree in $\tilde{D}a$ is at most $(2i-1)+1+1 = 2(i+1)-1$. This proves the claim in
general.

There follows at once the estimate

$$\log|B_i|_w \leq ((2i-1)(e-1)+1)\log\max\{1, |x|_w\}$$

for any $w$ not dividing $\infty$; and otherwise we get an extra term $(2i-1)\max\{\deg a_0, \ldots, \deg a_e\}$.
The same estimates also hold for $\log|C|_w$ where $C = x(A'(x))^{2i-1}$.

Now write $B_{ij}$ for the $B_i$ corresponding to the operators $D_j$ $(j = 1, \ldots, l)$ of $\mathcal{D}(i)$, so
that $D_jx/x = B_{ij}/C$. Then

$$h\left(\frac{D_1x}{x}, \ldots, \frac{D_lx}{x}\right) = \sum_w d_w\max\{\log|B_{i1}|_w, \ldots, \log|B_{il}|_w, \log|C|_w\}$$

which is at most

$$((2i-1)(e-1)+1)h(x) + (2i-1)d\max\{\deg a_0, \ldots, \deg a_e\}.$$

Finally by Lemma 2.1 this is at most

$$((2i-1)(e-1)+1)h(x) + (2i-1)eh(x) \leq 4ieh(x) \leq 4idh(x)$$

as required. This completes the proof of the present lemma.
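
The last inequality in this proof is pure arithmetic in $i$ and $e$, and can be checked mechanically. The following Python sketch is only an illustration; the function name `final_step_holds` is ours and does not occur in the paper. It tests that the coefficient of $h(x)$ never exceeds $4ie$ for a range of small integers $i, e \geq 1$.

```python
def final_step_holds(i: int, e: int) -> bool:
    """Check ((2i-1)(e-1)+1) + (2i-1)e <= 4ie for integers i, e >= 1."""
    lhs = ((2 * i - 1) * (e - 1) + 1) + (2 * i - 1) * e
    return lhs <= 4 * i * e


# Exhaustive check on a small grid; this is a sanity test, not a proof.
all_ok = all(final_step_holds(i, e) for i in range(1, 50) for e in range(1, 50))
print(all_ok)
```

The left-hand side expands to $4ie - 2i - 2e + 2$, so the inequality holds as soon as $i + e \geq 1$.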

In view of our consistent use of the relative height (as opposed to the absolute height), 
the factor d here looks like a normalization error. However it cannot be avoided, as the 
example x = (t = t\) in K = k(t)(x) = k(x) shows. One finds that the rational 

function ^§^f has denominator (t(t + 1))\ So its height is at least 2id = 2idh(x), which 
shows also that our dependence on i is not too bad. Perhaps even the factor 4 essentially 
cannot be avoided. 



4. Regulators. Let $K$ be finitely generated and transcendental over $k$ as in the preceding
section, and let $B$ be a transcendence basis. Let $G$ be a subgroup of $K^*$ finitely generated
modulo $k^*$; that is, $G/(G \cap k^*)$ is finitely generated. We show here how to define a regulator
$R(G) = R_B(G)$.

For all $w$ except finitely many we have $|g|_w = 1$ for every $g$ in $G$. Pick a set of $N \geq 1$
valuations containing these exceptions. We order the set to produce a map $\mathcal{L}$ from $G$ into
$\mathbf{R}^N$ whose typical coordinate is $d_w\log|g|_w$. In fact by (2.2) $\mathcal{L}(G)$ lies in $\mathbf{Z}^N$ and is therefore
discrete. Thus it is a (full) lattice in the real subspace it generates, whose dimension is the
rank $r$ of $G/(G \cap k^*)$. If $r \geq 1$ we define the regulator just as the determinant

$$R(G) = R_B(G) = \det\mathcal{L}(G) \geq 1;$$

clearly independent of the set above or its ordering, and if $r = 0$ we define $R(G) = 1$. This
does not quite coincide with the standard definition for the unit group in algebraic number
theory, because the latter is obtained by a projection to one dimension lower. But they
are equal up to a constant factor.

The following example will be quoted later. With $K = \mathbf{F}_p(t)$ (and the obvious $B$) and
$G_l$ generated by $t^l$ and $1-t$ we have $N = 3$ corresponding to valuations at $t = 0, 1, \infty$;
and so vectors $(-l, 0, l)$ and $(0, -1, 1)$ giving $R_B(G_l) = l\sqrt{3}$.
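
The regulator of this example can be recomputed as the square root of a Gram determinant. The Python sketch below is an illustration under our own conventions: the helper names `gram_determinant` and `regulator` are hypothetical, and the valuation vectors are entered with signs chosen so that each sums to zero, as the product formula requires. The Gram determinant does not depend on that sign choice.

```python
from fractions import Fraction
import math


def gram_determinant(vectors):
    """Exact determinant of the Gram matrix (v_a . v_b) of the given integer vectors."""
    r = len(vectors)
    gram = [[Fraction(sum(a * b for a, b in zip(u, v))) for v in vectors] for u in vectors]
    det = Fraction(1)
    for col in range(r):
        # Find a non-zero pivot in this column (Gaussian elimination over the rationals).
        pivot = next((row for row in range(col, r) if gram[row][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            gram[col], gram[pivot] = gram[pivot], gram[col]
            det = -det
        det *= gram[col][col]
        for row in range(col + 1, r):
            factor = gram[row][col] / gram[col][col]
            for k in range(col, r):
                gram[row][k] -= factor * gram[col][k]
    return det


def regulator(vectors):
    """Covolume of the lattice spanned by the vectors in the subspace they generate."""
    return math.sqrt(float(gram_determinant(vectors)))


l = 5
vectors = [(-l, 0, l), (0, -1, 1)]  # images of t^l and 1 - t at the places t = 0, 1, infinity
print(gram_determinant(vectors), regulator(vectors))
```

For these two vectors the Gram matrix has entries $2l^2$, $l$, $l$, $2$, so its determinant is $3l^2$ and the regulator is $l\sqrt{3}$.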

Lemma 4.1. Let $G, G'$ in $K^*$ be finitely generated modulo $k^*$ with $G$ of finite index in $G'$.
Then

$$R(G) = \frac{[G' : G]}{[G' \cap k^* : G \cap k^*]}\,R(G') = [G'/(G' \cap k^*) : G/(G \cap k^*)]\,R(G'),$$

where we identify $G/(G \cap k^*)$ as a subgroup of $G'/(G' \cap k^*)$.

Proof. The quotients $G/(G \cap k^*)$, $G'/(G' \cap k^*)$ are torsion-free, both with the same rank,
say $r$. If $r = 0$ the lemma is trivial. Otherwise using elementary divisors we can find
generators $\gamma_1, \ldots, \gamma_r$ of $G'/(G' \cap k^*)$ and positive integers $d_1, \ldots, d_r$ such that $\gamma_1^{d_1}, \ldots, \gamma_r^{d_r}$
generate $G/(G \cap k^*)$. Then the relationship between $\mathcal{L}(G')$ and $\mathcal{L}(G)$ is clear, and the
lemma follows.



Lemma 4.2. Let $G$ in $K^*$ be finitely generated modulo $k^*$, let $x$ be in $K^*$, and let $G'$ be
the group generated by $x$ and the elements of $G$. Then $R(G') \leq 2h(x)R(G)$.

Proof. It is geometrically clear that if $\Lambda$ is any lattice in euclidean space, then
$\det(\Lambda + \mathbf{Z}v) \leq \det(\Lambda)|v|$ for the length, at least if $v$ is not in the space spanned by $\Lambda$. But this continues
to hold for all $v$ provided only $|v| \geq 1$ and $\Lambda + \mathbf{Z}v$ remains discrete. In particular it holds
for $\Lambda = \mathcal{L}(G)$ and $v = \mathcal{L}(x)$. We conclude $R(G') \leq |\mathcal{L}(x)|R(G)$. Finally we have by
definition and the product formula

$$h(x) = \sum_w\max\{0, m_w\} = \frac{1}{2}\sum_w|m_w| \tag{4.1}$$

for $m_w = d_w\log|x|_w$. And

$$|\mathcal{L}(x)|^2 = \sum_w m_w^2 \leq \Big(\sum_w|m_w|\Big)^2 = 4(h(x))^2.$$

The lemma follows.
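
The two facts used at the end of this proof, namely the identity (4.1) and the bound of the Euclidean length by twice the height, can be illustrated on a concrete integer vector. This is a minimal Python sketch with hypothetical helper names; the sample vector is made up and only needs to have coordinate sum zero, as the product formula demands.

```python
import math


def height_from_valuations(m):
    """Height as in (4.1): the sum of the positive parts of a zero-sum integer vector."""
    if sum(m) != 0:
        raise ValueError("the product formula requires the coordinates to sum to zero")
    return sum(max(0, mw) for mw in m)


def euclidean_norm(m):
    """Euclidean length of the vector of valuations."""
    return math.sqrt(sum(mw * mw for mw in m))


m = [3, -1, -2, 0]  # made-up values of d_w log|x|_w at four places
h = height_from_valuations(m)
print(h, sum(abs(mw) for mw in m), euclidean_norm(m))
```

Here the positive parts sum to 3 and the absolute values to 6, in agreement with (4.1), while the Euclidean length $\sqrt{14}$ stays below $2h = 6$.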

We can recover a basis from the regulator as follows. 

Lemma 4.3. Let $G$ be a subgroup of $K^*$ finitely generated modulo $k^*$ with $G/(G \cap k^*)$ of
rank $r \geq 1$. Then there are $g_1, \ldots, g_r$ in $G$ generating $G/(G \cap k^*)$, with

$$h(g_1)\cdots h(g_r) \leq \delta(r)R(G)^2$$

for $\delta(r) = r^{3r}$.

Proof. By Minkowski's Second Theorem (see for example [Ca] Theorem V p. 218) there are
$\tilde{g}_1, \ldots, \tilde{g}_r$ in $G$ multiplicatively independent modulo $k^*$, with

$$|\mathcal{L}(\tilde{g}_1)|\cdots|\mathcal{L}(\tilde{g}_r)| \leq \frac{2^r}{V(r)}\det\mathcal{L}(G) = \frac{2^r}{V(r)}R(G) \tag{4.2}$$

for the Euclidean norms and the volume V(r) of the unit ball in R r . By geometry V(r) > 
(-^) r . We get a basis in the standard way using the argument of Mahler- Weyl (see for 
example [Ca] Lemma 8 p. 135); there results 

$$|\mathcal{L}(g_i)| \leq \max\{1, i/2\}\,|\mathcal{L}(\tilde{g}_i)| \quad (i = 1, \ldots, r),$$

and so in (4.2) gets replaced by t^-tW 2 < ^f=t- Now (4.1) gives 

$$h(g) = \sum_w\max\{0, m_w\} = \frac{1}{2}\sum_w|m_w|$$

for $m_w = d_w\log|g|_w$ in $\mathbf{Z}$. And $|m| \leq m^2$ for any $m$ in $\mathbf{Z}$, so we get

$$h(g) \leq \frac{1}{2}\sum_w m_w^2 = \frac{1}{2}|\mathcal{L}(g)|^2.$$

Therefore 

A r 3r i 

%i) •••%,) < -^RiG) 2 < -d(r)R(G) 2 

as desired.

In view of (4.2) it seems a pity that the square of the regulator appears in Lemma
4.3. But it cannot be avoided. For example let $\alpha_1, \ldots, \alpha_l, \beta_1, \ldots, \beta_l$ be different constants
in $k$, and consider $G$ generated by

$$g = \frac{(t-\alpha_1)\cdots(t-\alpha_l)}{(t-\beta_1)\cdots(t-\beta_l)}$$

in $K = k(t)$. Then $R(G) = \sqrt{2l}$.
The only possibilities for $g_1$ are $\gamma g^{\pm1}$ with $\gamma$ constant. But then $h(g_1) = l$, so any bound
$h(g_1) \leq \delta(1)R(G)$ is impossible.

This leads to the following uniform version of Lemma 3.4 when $x$ lies in $G$. Write $G_k$
for the group generated by the elements of $G$ and $k^*$.

Lemma 4.4. Let $G$ be a subgroup of $K^*$ finitely generated modulo $k^*$ with $G/(G \cap k^*)$ of
rank $r \geq 1$. Then for any $g$ in $G$ we have $h(g; i) \leq 4i^2d\delta(r)R(G)^2$. Further for any positive
integer $l$ there is $g_0$ in $G_k$ and $g'$ in $G$ with $g = g_0g'^l$ and $h(g_0) \leq l\delta(r)R(G)^2$.

Proof. Choose basis elements $g_1, \ldots, g_r$ according to Lemma 4.3, and write $g = cg_1^{e_1}\cdots g_r^{e_r}$
for rational integers $e_1, \ldots, e_r$ and $c$ in $k^*$. Replacing some of the $g_j$ by their inverses,
we can assume that all $e_j \geq 0$; this does not affect the estimate in Lemma 4.3. Then by
Lemma 3.3

$$h(g; i) = h(g_1^{e_1}\cdots g_r^{e_r}; i) \leq h(g_1^{e_1}; i) + \cdots + h(g_r^{e_r}; i) \leq i\,(h(g_1; i) + \cdots + h(g_r; i)).$$

This in turn by Lemma 3.4 is at most

$$4i^2d\,(h(g_1) + \cdots + h(g_r)) \leq 4i^2d\,r\,h(g_1)\cdots h(g_r) \leq 4i^2d\,\delta(r)R(G)^2 \tag{4.3}$$

as required in the first part of the present lemma. And the second part follows by writing
$e_j = f_j + le'_j$ with $0 \leq f_j < l$ $(j = 1, \ldots, r)$ (compare also [D] p. 197), taking
$g_0 = cg_1^{f_1}\cdots g_r^{f_r}$, $g' = g_1^{e'_1}\cdots g_r^{e'_r}$ and using the inequality in (4.3).
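
The second part of this proof is division with remainder applied to each exponent. A minimal Python sketch follows, assuming non-negative exponents as arranged in the proof; the function name `split_exponents` is hypothetical.

```python
def split_exponents(exponents, l):
    """Write each e_j as f_j + l * e'_j with 0 <= f_j < l, returning (remainders, quotients)."""
    if l <= 0:
        raise ValueError("l must be a positive integer")
    pairs = [divmod(e, l) for e in exponents]  # divmod returns (quotient, remainder)
    remainders = [f for _, f in pairs]
    quotients = [q for q, _ in pairs]
    return remainders, quotients


remainders, quotients = split_exponents([7, 0, 12], 5)
print(remainders, quotients)
```

The remainders become the exponents of $g_0$ and the quotients those of $g'$, so each exponent of $g_0$ is smaller than $l$, which is what feeds the height bound for $g_0$.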

The final result of this section will lead easily to a quantitative version of Lemma 2 of 
[Mass] (p. 193), such as those mentioned in [Mass] (pp 194,195). However it involves better 
constants and is no longer restricted to positive characteristic. It is here, by the way, that 
the radical $\sqrt{G}$ makes its essential appearance in the whole story.

Lemma 4.5. Suppose that $x, y$ are in $K^*$ with $x$ not in $\sqrt{G_k}$ and $x/y^q$ in $G$ for some positive
integer $q$. Then $q \leq 2h(x)R(G)$.

Proof. Let $G'$ be the group generated by $x$ and the elements of $G$, and let $G''$ be the group
generated by $y$ and the elements of $G$, so that $G'$ lies in $G''$. Since $x$ is not in $\sqrt{G}$, it is
easy to see that the index $[G'' : G'] = q$. Since $x$ is not even in $\sqrt{G_k}$, it is even easier to
see that $G \cap k^* = G' \cap k^* = G'' \cap k^*$. Thus by Lemma 4.1 we have $R(G') = qR(G'') \geq q$.
On the other hand $R(G') \leq 2h(x)R(G)$ by Lemma 4.2, and the result follows.

5. Dependence with regulators. Let $K$ be finitely generated and transcendental over
$k$ as in the preceding sections, and let $B$ be a transcendence basis, now assumed separable,
with elements $t_1, \ldots, t_b$. We continue to abbreviate the height $h_B$ as $h$, and again we write
$C$ for the field of differential constants of $K$.

The following result eliminates the height functions $h(x; m-1)$ from Lemma 3.1,
thereby providing a more useful explicit version of Lemma 3 of [Mass].

Lemma 5.1. Let $G$ in $K^*$ be finitely generated of rank $r \geq 1$ modulo $k^*$, and for $m \geq 2$
suppose $c_1, \ldots, c_m$ are in $C$ and $g_1, \ldots, g_m$ are in $G$ with

$$c_1g_1 + \cdots + c_mg_m = 1.$$

Then either

(a) $h(c_1g_1, \ldots, c_mg_m) \leq 4m^4d\delta(r)R(G)^2$

or

(b) $g_1, \ldots, g_m$ are linearly dependent over $C$.

Proof. Just use Lemma 3.1 together with the inequalities

$$h(g; m-1) \leq 4(m-1)^2d\delta(r)R(G)^2 \tag{5.1}$$

from Lemma 4.4, with $g = g_1, \ldots, g_m$.
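
For the reader's convenience, the constant $4m^4$ can be traced as follows. This is our own expansion of the one-line proof, obtained by inserting (5.1) into conclusion (a) of Lemma 3.1; it is a sketch of the arithmetic, not a quotation.

```latex
\begin{align*}
h(c_1g_1,\ldots,c_mg_m)
  &\le (m+1)\sum_{j=1}^{m} h(g_j;\,m-1) \\
  &\le (m+1)\cdot m\cdot 4(m-1)^2\, d\,\delta(r)R(G)^2 \\
  &= 4\,(m^4 - m^3 - m^2 + m)\, d\,\delta(r)R(G)^2 \\
  &\le 4m^4\, d\,\delta(r)R(G)^2 ,
\end{align*}
% the last step uses m^3 + m^2 - m >= 0 for every integer m >= 1.
```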

Similarly we deduce a more useful explicit version of Lemma 4 of [Mass] . 

Lemma 5.2. Let $G$ in $K^*$ be finitely generated of rank $r \geq 1$ modulo $k^*$, and for $m \geq 2$
suppose $g_0, g_1, \ldots, g_m$ are in $G$ and linearly dependent over $C$ but $g_1, \ldots, g_m$ are linearly
independent over $C$. Then there is a relation

$$c_1g_1 + \cdots + c_mg_m = g_0$$

with $c_1, \ldots, c_m$ in $C$ and

$$h\left(\frac{c_1g_1}{g_0}, \ldots, \frac{c_mg_m}{g_0}\right) \leq 4m^4d\delta(r)R(G)^2.$$

Proof. Just use Lemma 3.2 and (5.1), this time with $g = g_1/g_0, \ldots, g_m/g_0$.

We have followed the proof in [Mass] quite closely. It would have been nice to see the
well-known number $m(m-1)/2$ in place of $4m^4$, and also some notion of genus and $S$-units as
in various formulations of abc matters over function fields. But despite the considerations
of Chapter 14 of [BG] in zero characteristic and those of Hsia and Wang [HW] for arbitrary
characteristic we have been unable to supply a satisfactory version. The results of [HW]
are especially interesting in their use of divided derivatives or hyperderivations, which for
example in characteristic $p$ leads to linear dependence over the field of $p^e$th powers, not just
over $C$ with $e = 1$. If this could be done in our situation, then it would probably lead to
simplifications in the rest of our proof, and possibly to the elimination of the Proposition in
section 8. But it seems that the results of [HW] cannot be directly applied to our Lemma
5.1, due to the presence of $c_1, \ldots, c_m$ whose heights cannot be controlled.



6. Isotriviality. We take a well-earned break from estimating. From now on $K$ will
have positive characteristic $p$ (actually this assumption is not really needed until section
8), and, as in section 1, we write $\mathbf{F}_K$ for $\overline{\mathbf{F}_p} \cap K$. This field plays the role of $k$ in sections
2, 3, 4, 5.

Suppose $n \geq m \geq 1$. For $a(i,j)$ in $K$ the normalized equations

$$X_i = a(i,0)X_0 + \cdots + a(i,m-1)X_{m-1} = \sum_{j=0}^{m-1}a(i,j)X_j \quad (i = m, m+1, \ldots, n) \tag{6.1}$$

define in $\mathbf{P}_n$ a linear variety $V$ of dimension $m-1$. When $G$ is a subgroup of $K^*$, we need
some conditions which ensure that $V$ is $G$-isotrivial.

Now any $G$-automorphism taking $(X_0, \ldots, X_n)$ to $(g_0X_0, \ldots, g_nX_n)$ leads after renormalization
to new coefficients $\frac{g_i}{g_j}a(i,j)$. If the new forms are defined over $\mathbf{F}_K$, then every
non-zero $a(i,j)$ has the shape $\frac{g_j}{g_i}\alpha(i,j)$ for non-zero $\alpha(i,j)$ in $\mathbf{F}_K$. In particular each
equation in (6.1) defines a $G$-isotrivial variety. But also each quotient

$$\frac{a(i_1,j_1)\,a(i_2,j_2)\,a(i_3,j_3)\cdots a(i_{k-1},j_{k-1})\,a(i_k,j_k)}{a(i_1,j_2)\,a(i_2,j_3)\,a(i_3,j_4)\cdots a(i_{k-1},j_k)\,a(i_k,j_1)} \quad (k = 2, \ldots, n+1), \tag{6.2}$$

with everything in the numerator and denominator non-zero, lies in $\mathbf{F}_K$. The following
result gives a converse statement which guarantees that the equations (6.1) become defined
over $\mathbf{F}_K$ after applying a suitable $G$-automorphism and renormalizing. In particular it
guarantees that $V$ is $G$-isotrivial; but without the need to recombine the equations.
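
The reason the quotients (6.2) are insensitive to $G$-automorphisms is that every row index and every column index occurs equally often in numerator and denominator, so all scaling factors cancel. The Python sketch below illustrates this with rational numbers standing in for elements of $K$; the helper names `cyclic_quotient` and `rescale`, the sample coefficients, and the scaling convention $(g_i/g_j)a(i,j)$ are assumptions made for the illustration.

```python
from fractions import Fraction


def cyclic_quotient(a, rows, cols):
    """Quotient (6.2): prod a(i_s, j_s) / prod a(i_s, j_{s+1}), with j_{k+1} = j_1."""
    k = len(rows)
    numerator = Fraction(1)
    denominator = Fraction(1)
    for s in range(k):
        numerator *= a[(rows[s], cols[s])]
        denominator *= a[(rows[s], cols[(s + 1) % k])]
    return numerator / denominator


def rescale(a, g):
    """Coefficients after scaling each X_i by g_i and renormalizing: (g_i / g_j) a(i, j)."""
    return {(i, j): Fraction(g[i]) / Fraction(g[j]) * value for (i, j), value in a.items()}


# Made-up example with m = 2, n = 3: equations for X_2 and X_3 in terms of X_0, X_1.
a = {(2, 0): Fraction(2), (2, 1): Fraction(3), (3, 0): Fraction(5), (3, 1): Fraction(7)}
g = {0: 2, 1: 3, 2: 5, 3: 7}

before = cyclic_quotient(a, rows=[2, 3], cols=[0, 1])
after = cyclic_quotient(rescale(a, g), rows=[2, 3], cols=[0, 1])
print(before, after)
```

Both values agree, whichever direction of scaling one adopts, since the factors $g_i$ and $g_j$ cancel in the cyclic product.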

Lemma 6.1. Suppose that each equation in (6.1) defines a $G$-isotrivial variety, and that
each quotient (6.2) lies in $\mathbf{F}_K$ provided everything in the numerator and denominator is
non-zero. Then $V$ is $G$-isotrivial.

Proof. We argue by induction on the number $n-m+1 \geq 1$ of equations. If $n-m+1 = 1$
then the result is obvious without using (6.2). Suppose we have done it for $n-m \geq 1$
equations, namely the first $n-m$ in (6.1), and let us add another equation, namely the
last one in (6.1).

Restricting to $i < n$ and the appropriate $j$ in (6.2), we see from the induction hypothesis
that a suitable $G$-automorphism trivializes the first $n-m$ equations, without bothering
about $X_n$. This means that we can assume that all $a(i,j) \neq 0$ $(i < n)$ are in $\mathbf{F}_K$; while
the isotriviality of the last equation means that all $a(n,j) \neq 0$ are in $G$. We now want to
trivialize the last equation.

How can we trivialize a given coefficient $a(n,j) \neq 0$ in the last equation? If all
$a(i,j) = 0$ $(i < n)$, so that the first $n-m$ equations did not involve $X_j$, then we can
simply replace $X_j$ by $a(n,j)X_j$ and this will not change the first $n-m$ equations. We do
this for all such $j$.

If there is only a single $j$ with some $a(i,j) \neq 0$ $(i < n)$, then we can still replace $X_j$
by $a(n,j)X_j$; but we then have to correct the new coefficients $a(i,j)/a(n,j) \neq 0$ of $X_j$ in the $i$th
equation by replacing $X_i$ by $a(n,j)X_i$ $(i = m, \ldots, n-1)$. Things are less easy when there
is more than one such $j$. Call these "bad".

Now we say for different $j, j'$ in the set $\{0, \ldots, m-1\}$ that $j \sim j'$ if there is $i < n$
with

$$a(i,j)\,a(i,j') \neq 0 \tag{6.3}$$

(in particular then $j, j'$ are both bad). This relation is symmetric but probably not transitive.
We can extend it via reflexivity and transitivity to a genuine equivalence relation
on the bad elements of $\{0, \ldots, m-1\}$, which we then denote by $\sim$.
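
Passing from the relation (6.3) to its transitive closure amounts to finding connected components, which a union-find structure does directly. The following Python sketch is an illustration only: the function name `bad_classes` and the dictionary encoding of the coefficients $a(i,j)$ for $m \leq i < n$ are our own choices.

```python
def bad_classes(a, m, n):
    """Equivalence classes of bad column indices j in 0..m-1.

    `a` maps (i, j) to the coefficient a(i, j) for m <= i < n; missing keys mean zero.
    Two indices are linked when some row i < n has both coefficients non-zero, as in (6.3).
    """
    def nonzero(i, j):
        return a.get((i, j), 0) != 0

    bad = [j for j in range(m) if any(nonzero(i, j) for i in range(m, n))]
    parent = {j: j for j in bad}

    def find(j):
        # Follow parent pointers to the class representative, compressing the path.
        while parent[j] != j:
            parent[j] = parent[parent[j]]
            j = parent[j]
        return j

    for i in range(m, n):
        support = [j for j in bad if nonzero(i, j)]
        for j in support[1:]:
            parent[find(j)] = find(support[0])

    classes = {}
    for j in bad:
        classes.setdefault(find(j), []).append(j)
    return sorted(classes.values())


# Made-up example with m = 4, n = 7: rows 4, 5, 6 are the first n - m equations.
a = {(4, 0): 1, (4, 1): 1, (5, 1): 1, (5, 2): 1, (6, 3): 1}
print(bad_classes(a, m=4, n=7))
```

In this example row 4 links the indices 0 and 1, row 5 links 1 and 2, and index 3 is bad but isolated, so there are two classes.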

We assume for the moment that there is a single equivalence class: any two $j, j'$ are
related.

Let $j, j'$ be different bad elements, so that $a(i,j) \neq 0$, $a(i',j') \neq 0$ for some $i, i' < n$.
From our equivalence class assumption $j \sim j'$. Suppose that

$$j = j_1 \sim j_2 \sim \cdots \sim j_{k-1} \sim j_k = j',$$

where of course we can take $2 \leq k \leq n+1$. Then we get from (6.3)

$$a(i_1,j_1)a(i_1,j_2) \neq 0, \quad a(i_2,j_2)a(i_2,j_3) \neq 0, \quad \ldots, \quad a(i_{k-1},j_{k-1})a(i_{k-1},j_k) \neq 0$$

for some $i_1, i_2, \ldots, i_{k-1} < n$. We use (6.2) with $i_k = n$ to see that

$$\frac{a(i_1,j_1)\,a(i_2,j_2)\,a(i_3,j_3)\cdots a(i_{k-1},j_{k-1})\,a(n,j')}{a(i_1,j_2)\,a(i_2,j_3)\,a(i_3,j_4)\cdots a(i_{k-1},j_k)\,a(n,j)}$$

lies in $\mathbf{F}_K$. However the first $k-1$ terms in both numerator and denominator already lie
in $\mathbf{F}_K$, because we trivialized the first $n-m$ equations. Consequently $a(n,j')/a(n,j)$ lies in $\mathbf{F}_K$.

Thus we have shown that all $a(n,j)$ for bad $j$ are multiples of a single one, call it $g$,
by elements of $\mathbf{F}_K$. Now they can be simultaneously trivialized on replacing $X_j$ by $gX_j$.
Again we must correct the new coefficients $a(i,j)/g$ of $X_j$ in the $i$th equation by replacing
$X_i$ by $gX_i$ $(i = m, \ldots, n-1)$.

What happens if there is more than a single equivalence class on the bad elements
of $\{0, \ldots, m-1\}$? Say there are $h \geq 2$ classes $J_1, \ldots, J_h$. Let $I_1$ be the set of $i$ in
$\{m, \ldots, n-1\}$ for which there is $j$ in $J_1$ with $a(i,j) \neq 0$; and similarly for $I_2, \ldots, I_h$.
Then $I_1, I_2, \ldots, I_h$ are disjoint, because for example with any $j_1$ in $J_1$ and any $j_2$ in $J_2$
there can be no $i$ with $a(i,j_1)a(i,j_2) \neq 0$, else by (6.3) we would have $j_1 \sim j_2$. (If one
wishes, one can convert the matrix of the first $n-m$ equations into a block matrix using
row and column permutations.) The argument above, using $i_1, \ldots, i_{k-1}$ in $I_1$, shows that
all non-zero $a(n,j)$ $(j \in J_1)$ are multiples of a single one, call it $g_1$, by elements of $\mathbf{F}_K$.
Similarly we get $g_2, \ldots, g_h$. Now we can trivialize the last row as follows. We replace the
$X_j$ $(j \in J_1)$ by $g_1X_j$ and we correct the effect by replacing $X_i$ by $g_1X_i$ $(i \in I_1)$. Similarly
using $g_2, \ldots, g_h$ we trivialize the remaining coefficients. This completes the proof.

7. Automorphisms. As above let $K$ be a field, finitely generated and transcendental
over $\mathbf{F}_p$, with $G$ a subgroup of $K^*$. Suppose a linear variety in $\mathbf{P}_n$ is defined over $K$
and $G$-isotrivial. Then by definition there is a $G$-automorphism $\psi$ taking it to something
defined over $\mathbf{F}_K = \overline{\mathbf{F}_p} \cap K$. To make our Theorems 1, 2 and 3 fully effective we have to
estimate this $\psi$; indeed after doing the whole descent to single points using Theorem 1, for
example, it is mainly $G$-automorphisms that are left.

Now it is convenient to use the projective height $h^p = h^p_B$ defined on $\mathbf{P}_{l-1}(K)$ by

$$h^p(x_1, \ldots, x_l) = \log\prod_w\max\{|x_1|_w^{d_w}, \ldots, |x_l|_w^{d_w}\}.$$

This yields at once a height $h(\psi)$ of a $G$-automorphism $\psi$, defined by (1.7), as

$$h(\psi) = h^p(g_0, \ldots, g_n).$$

Also if $V$ is linear in $\mathbf{P}_n$ defined over $K$, it yields a height $h(V)$ in the standard way via
the Grassmannian coordinates of $V$; see for example [S] (p. 28), which however is in the
context of number fields with euclidean norms at the archimedean valuations. Here we
have no archimedean valuations, so the norm problem is irrelevant. If $m-1 \geq 0$ is the
dimension of $V$, then its Grassmannians $A(I)$ correspond to subsets $I$ of $\{0, \ldots, n\}$ with
cardinality $n-m+1 \leq n$. The Northcott Property extends at once to this height. Also
for $\psi$ in (1.7) the Grassmannians of $\psi(V)$ are the $A(I)/g(I)$, where $g(I) = \prod_{i \in I}g_i$. It follows
easily that

$$h(\psi(V)) \leq h(V) + nh(\psi), \quad h(\psi^{-1}) \leq nh(\psi). \tag{7.1}$$
Less obvious is the following, which involves a second linear variety W also over K. 

Lemma 7.1. If $V \cap W$ is non-empty then we have $h(V \cap W) \leq h(V) + h(W)$. If further
$X_{n-1} \neq 0$ on $V$ and the equations of $V$ do not involve $X_n$, and $W$ is defined by $X_n = aX_{n-1}$
then $h(V \cap W) \geq \max\{h(V), h(W)\}$.

Proof. The upper bound may be compared with the inequality $h(V \cap W) + h(V + W) \leq
h(V) + h(W)$ due independently to Struppeck-Vaaler [SV] (Theorem 1 p. 493) and Schmidt
[S] (Lemma 8A p. 28). These are proved over number fields; however it is easily checked
that the proof in [S] remains valid with trivial modifications. Already a special case was
noted by Thunder [T] whose Lemma 5 (p. 157) implies $h(V + W) \leq h(V) + h(W)$ over
function fields of a single variable provided $V \cap W$ is empty.

Regarding the lower bound, let $A(I)$ be the Grassmannians of $V$. Then it is easy to
verify that the Grassmannians of $V \cap W$ consist of the $A(I)$ together with the $aA(J)$ for
$J$ not containing $n-1$. There follows $h(V \cap W) \geq h(V)$ at once. Also $X_{n-1} \neq 0$ on $V$
means that at least one $A = A(J)$ is non-zero (see for example Theorem 1 of [HP] p. 298),
so we get also $h(V \cap W) \geq h^p(A, aA) = h(a) = h(W)$. This completes the proof.

It is the following result which enables $\psi$ to be estimated in the Descent Steps.

Lemma 7.2. Suppose that $V$ is defined over $K$ and is $G$-isotrivial. Then there is a
$G$-automorphism $\psi$ with $\psi(V)$ defined over $\mathbf{F}_K$ and $h(\psi) \leq n!\,h(V)$.

Proof. Suppose $\dim V = m-1$ with Grassmannians $A(I)$; then as noted above the
Grassmannians of $\psi(V)$ are the $A(I)/g(I)$, where $g(I) = \prod_{i \in I}g_i$. If $\psi(V)$ is defined over $\mathbf{F}_K$ then
these have the shape $\lambda\alpha(I)$ for $\lambda$ in $K^*$ and $\alpha(I)$ in $\mathbf{F}_K$. Thus we have $A(I) = \lambda\alpha(I)g(I)$
for all $I$; but we can restrict to the set $\mathcal{I}$ of all $I$ with $A(I) \neq 0$ (and so $\alpha(I) \neq 0$). We can
eliminate the $\lambda$ by fixing $I_0$ in $\mathcal{I}$; this gives

$$\frac{g(I)}{g(I_0)} = \frac{A(I)}{A(I_0)}\,\frac{\alpha(I_0)}{\alpha(I)} \quad (I \in \mathcal{I}). \tag{7.2}$$

Conversely (7.2) implies that $\psi(V)$ is defined over $\mathbf{F}_K$.

To solve (7.2) for $g_0, \ldots, g_n$ we divide the numerator and denominator of the left-hand
side by $g_0^{n-m+1}$ and write it as $(g_1/g_0)^{a(I,1)}\cdots(g_n/g_0)^{a(I,n)}$ for integers $a(I,i)$ which are
$0, 1, -1$. If the vectors $a(I)$ $(I \in \mathcal{I})$ with coordinates $a(I,i)$ $(i = 1, \ldots, n)$ have full rank $n$
then we can solve (7.2) by choosing $a(I_1), \ldots, a(I_n)$ linearly independent and then solving
the subset of (7.2) with $I = I_1, \ldots, I_n$. Writing $Q_j$ for the right-hand side of (7.2) with
$I = I_j$, a multiplicative form of Cramer's Rule gives

$$\left(\frac{g_i}{g_0}\right)^b = Q_1^{b_{i1}}\cdots Q_n^{b_{in}} \quad (i = 1, \ldots, n)$$

with integers $b \neq 0$ and $b_{ij}$. These $b_{ij}$ are minors of a matrix with entries $0, 1, -1$ and so
$|b_{ij}| \leq (n-1)!$.

Now taking heights leads to

$$|b|\,h\left(\frac{g_1}{g_0}, \ldots, \frac{g_n}{g_0}\right) \leq \max_{i=1,\ldots,n}\{|b_{i1}| + \cdots + |b_{in}|\}\,h(Q_1, \ldots, Q_n).$$

The height on the left is $h(\psi)$ and that on the right at most $h(V)$. The result follows at
once, at least under our assumption that the $a(I)$ $(I \in \mathcal{I})$ have full rank $n$.

If this assumption does not hold, then we simply increase the rank by successively adjoining
unit vectors $e_k$ until the rank becomes $n$; this amounts to the addition of equations
$g_k/g_0 = 1$. Now we take a subset of $n$ independent equations and solve again with Cramer.
The resulting estimates are certainly no larger than before, and this completes the proof.
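
The bound on the minors used in this proof comes from the Leibniz expansion: a $k \times k$ determinant has $k!$ terms, each of absolute value at most 1 when the entries are $0, 1, -1$. The Python sketch below confirms this by brute force for very small sizes; the helper names `det` and `max_abs_det` are our own, and the enumeration is an illustration rather than part of the argument.

```python
from itertools import product
from math import factorial


def det(matrix):
    """Integer determinant by Laplace expansion along the first row (small sizes only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def max_abs_det(size):
    """Largest |det| over all size x size matrices with entries in {-1, 0, 1}."""
    best = 0
    for entries in product((-1, 0, 1), repeat=size * size):
        matrix = [list(entries[r * size:(r + 1) * size]) for r in range(size)]
        best = max(best, abs(det(matrix)))
    return best


for size in (1, 2, 3):
    print(size, max_abs_det(size), factorial(size))
```

For sizes 2 and 3 the true maxima are 2 and 4, so the factorial bound is sharp for $2 \times 2$ minors and has room to spare beyond that.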



8. A proposition. This, the main result of the present section, is a first step in the proof
of the Descent Step over $\sqrt{G}$, with $V$ in $\mathbf{P}_n$ $(n \geq 2)$ either a hyperplane or defined over
a finite field. We continue with our assumption that $K$ is finitely generated over $\mathbf{F}_p$; thus
$\mathbf{F}_K = \overline{\mathbf{F}_p} \cap K$ is a finite field. Let $G$ in $K^*$ be finitely generated of rank $r \geq 1$ modulo
$\mathbf{F}_K^*$; now we may write without confusion simply that $G$ is finitely generated. It is known
that the radical $\sqrt{G}$, which by definition lies still in $K$, is also finitely generated (see for
example [Mass] p. 195), also clearly of rank $r$ over $\mathbf{F}_K^*$. For the moment we work exclusively
with this radical. We further assume that $K$ is transcendental over $\mathbf{F}_p$ and we choose any
separable transcendence basis $B$; then we are free to apply the results of sections 3, 4 and
5 about heights $h = h_B$ and regulators $R = R_B$.

We say that V is transversal if every coordinate Xi (i = 0, . . . , n) actually occurs 
in the defining equations. This property is independent of the choice of equations. Its 
purpose is to prevent "free variables" as in (1.1) with 7^ 0. 

Transversality is a harmless restriction because we could overcome it simply by working
in lower dimensions. Clearly every linear subvariety of a transversal variety is also
transversal. Also a transversal variety must be proper (i.e. not the full $\mathbf{P}_n$).

We recall the function $\delta$ from Lemma 4.3.

Proposition. Let $V$ be a transversal linear subvariety of $\mathbf{P}_n$ defined over $K$, and suppose
either that $V$ has dimension $n-1$ or that $V$ is defined over some $\mathbf{F}_q$. Suppose also that
$V$ is not contained in any coset $T \neq \mathbf{P}_n$. Let $\pi$ be any point of $V(\sqrt{G})$.

If $V$ has dimension $n-1$, then either

(i) there is a proper linear subvariety $W$ of $V$, also defined over $K$, with

$$h(W) \leq 8n^54^nd\delta(n+r)h(V)^{2n}R(\sqrt{G})^2,$$

such that $\pi$ lies in $W(\sqrt{G})$,

or

(ii) there is a $\sqrt{G}$-automorphism $\psi$ with

$$h(\psi) \leq np\,\delta(n+r)R(\sqrt{G})^2,$$

a point $\pi'$ and a linear subvariety $V'$ of $\mathbf{P}_n$ such that $\pi = \psi(\pi'^p)$ and $V = \psi(V'^p)$.

If $V$ is defined over $\mathbf{F}_q$, then either

(i) there is a proper linear subvariety $W$ of $V$, also defined over $K$, with

$$h(W) \leq 8n^54^nd\delta(n+r)R(\sqrt{G})^2,$$

such that $\pi$ lies in $W(\sqrt{G})$,

or

(iii) there is a point $\pi'$ in $\mathbf{P}_n(\sqrt{G})$ with $\pi = \pi'^p$.
Proof. Suppose first that $V$ has dimension $n-1$. Then we just have to follow the arguments
of the proof of Lemma 5 of [Mass] (p. 197). Because these arguments are expressed in terms
of "broad sets" and this notion is no longer appropriate, we write out all the details.

Because $V$ is transversal, we may work affinely with a point $\pi = (x_1, \ldots, x_n)$ satisfying
a single equation

$$a_1x_1 + \cdots + a_nx_n = 1 \tag{8.1}$$

with non-zero coefficients. As in section 3 write $C$ for the field of $p$th powers in $K$, and
consider

$$s = \dim_C(Ca_1x_1 + \cdots + Ca_nx_n),$$

so that $1 \leq s \leq n$.

First suppose that $s = n$. Then we apply Lemma 5.1 with $k = \mathbf{F}_K$, $m = n$ and
$c_1 = \cdots = c_m = 1$ and $g_1 = a_1x_1, \ldots, g_m = a_mx_m$. So the group must be enlarged by
adjoining $a_1, \ldots, a_n$ to $\sqrt{G}$: becoming of rank at most $n+r$. The enlarged regulator $R$
can be estimated by Lemma 4.2, and we find

$$R \leq 2^nh(a_1)\cdots h(a_n)R(\sqrt{G}) \leq 2^nh(V)^nR(\sqrt{G}). \tag{8.2}$$

The conclusion (b) of Lemma 5.1 is ruled out by $s = n$; and the conclusion (a) shows that

$$h(a_1x_1, \ldots, a_nx_n) \leq 4n^4d\delta(n+r)R^2.$$

It follows that $h(\pi) = h(x_1, \ldots, x_n)$ is at most

$$4n^4d\delta(n+r)R^2 + h(a_1^{-1}, \ldots, a_n^{-1}) \leq 4n^4d\delta(n+r)R^2 + nh(V)$$

and so from (8.2) we deduce

$$h(\pi) \leq 4n^44^nd\delta(n+r)h(V)^{2n}R(\sqrt{G})^2 + nh(V) \leq 8n^44^nd\delta(n+r)h(V)^{2n}R(\sqrt{G})^2. \tag{8.3}$$

So this gives $W = \{\pi\}$ for (i) of the Proposition; and for these $h(W) = h(\pi)$ is bounded
as in (8.3).

Next suppose that $1 < s < n$. By means of a permutation we can assume that
$g_1 = a_1x_1, \ldots, g_s = a_sx_s$ are linearly independent over $C$. Take any $k$ with $s+1 \leq k \leq n$;
then we can apply Lemma 5.2 with $m = s$ and $g_0 = a_kx_k$, $\sqrt{G}$ being enlarged as above.
We find relations

$$\sum_{j=1}^s c_{kj}a_jx_j = a_kx_k \quad (k = s+1, \ldots, n) \tag{8.4}$$

with $c_{kj}$ in $C$ and the quotients

$$f_{kj} = c_{kj}\frac{a_jx_j}{a_kx_k} \quad (j = 1, \ldots, s;\ k = s+1, \ldots, n) \tag{8.5}$$

satisfying

$$h(f_{k1}, \ldots, f_{ks}) \leq 4s^4d\delta(n+r)R^2 \quad (k = s+1, \ldots, n). \tag{8.6}$$

We use (8.4) to eliminate the $a_kx_k$ $(k = s+1, \ldots, n)$ in (8.1). We find

$$c_1a_1x_1 + \cdots + c_sa_sx_s = 1 \tag{8.7}$$

with

$$c_j = 1 + \sum_{k=s+1}^n c_{kj} \quad (j = 1, \ldots, s) \tag{8.8}$$

also in $C$.

Next apply Lemma 5.1 with $m = s$ to (8.7) and $g_j = a_jx_j$ $(j = 1, \ldots, s)$ also in the
enlarged $\sqrt{G}$. Again conclusion (b) is impossible. It follows that the

$$f_j = c_ja_jx_j \quad (j = 1, \ldots, s) \tag{8.9}$$

satisfy

$$h(f_1, \ldots, f_s) \leq 4s^4d\delta(n+r)R^2. \tag{8.10}$$

So in (8.5) certain quotients $x_j/x_k$ are bounded modulo $C$ whereas in (8.9) certain $x_j$
themselves are bounded modulo $C$. We can eliminate $C$ by substituting (8.8) into (8.9)
and using (8.5) to get

$$f_j = a_jx_j + \sum_{k=s+1}^n f_{kj}a_kx_k \quad (j = 1, \ldots, s). \tag{8.11}$$

Since $a_j \neq 0$ $(j = 1, \ldots, s)$ these express the fact that $\pi = (x_1, \ldots, x_n)$ lies on a linear
variety $V'$ of dimension $n-s$; and because $s \neq 1$ this dimension is strictly less than the
dimension $n-1$ of $V$. So we can take $W$ as the intersection of $V$ with $V'$. This is in fact
$V'$ because if we add up all the above equations (8.11) and use (8.4), (8.5), (8.7), (8.9), then
we end up with (8.1).
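
The elimination leading to (8.11) is a formal identity, so it can be sanity-checked with exact rational arithmetic. In the Python sketch below, rational numbers stand in for elements of $K$ and the special role of $C$ is ignored; the function name `check_elimination` and the sample values are made up for the illustration.

```python
from fractions import Fraction


def check_elimination(y, c):
    """Check (8.11) and the passage from (8.1) to (8.7) for sample data.

    y    -- the products y_j = a_j x_j for j = 1..s (non-zero)
    c[k] -- the coefficients c_kj of the relation (8.4) for each k = s+1..n
    """
    s = len(y)
    # (8.4): each remaining product a_k x_k is a combination of y_1, ..., y_s.
    y_extra = [sum(row[j] * y[j] for j in range(s)) for row in c]
    # (8.5): f_kj = c_kj * a_j x_j / (a_k x_k).
    f_kj = [[row[j] * y[j] / yk for j in range(s)] for row, yk in zip(c, y_extra)]
    # (8.8) and (8.9).
    c_j = [1 + sum(row[j] for row in c) for j in range(s)]
    f_j = [c_j[j] * y[j] for j in range(s)]
    # Right-hand side of (8.11).
    rhs = [y[j] + sum(f_kj[k][j] * y_extra[k] for k in range(len(c))) for j in range(s)]
    # Left-hand sides of (8.1) and (8.7) must agree.
    return f_j == rhs and sum(y) + sum(y_extra) == sum(f_j)


y = [Fraction(1, 3), Fraction(1, 5)]
c = [[Fraction(2), Fraction(-1)], [Fraction(1, 2), Fraction(3)]]
print(check_elimination(y, c))
```

The check also confirms the closing remark of the paragraph: summing the equations (8.11) over $j$ returns the left-hand side of (8.1).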






Now we have to estimate the height of (8.11). In the corresponding matrix, every
column has by (8.6) and (8.10) height at most $4s^4d\delta(n+r)R^2 + h(V)$, which as above in
(8.3) we can estimate by $B = 8n^44^nd\delta(n+r)h(V)^{2n}R(\sqrt{G})^2$. It follows that

$$h(W) \leq sB \leq 8n^54^nd\delta(n+r)h(V)^{2n}R(\sqrt{G})^2.$$

This too settles (i) of the Proposition.

Finally suppose $s = 1$. This means that $a_1x_1, \ldots, a_nx_n$ are in $C$. By Lemma 4.4 with
$l = p$ we can write $x_j = g_jx_j'^p$ with $g_j, x'_j$ in $\sqrt{G}$ $(j = 1, \ldots, n)$ and

$$h(g_j) \leq p\,\delta(r)R(\sqrt{G})^2 \leq p\,\delta(n+r)R(\sqrt{G})^2 \quad (j = 1, \ldots, n).$$

Then $a_jg_j$ is in $C$ so has the form $a_j'^p$ $(j = 1, \ldots, n)$. Finally

$$1 = a_1x_1 + \cdots + a_nx_n = a_1'^px_1'^p + \cdots + a_n'^px_n'^p = (a'_1x'_1 + \cdots + a'_nx'_n)^p,$$

and this gives part (ii) of the Proposition, with $\psi$ as in (1.7) above for $g_0 = 1$, $\pi' =
(x'_1, \ldots, x'_n)$, and $V'$ defined by (8.1) above with the new coefficients $a'_1, \ldots, a'_n$.
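
The last equality in this display is the additivity of the Frobenius map, which rests on the fact that the binomial coefficients $\binom{p}{k}$ with $0 < k < p$ are divisible by $p$. A minimal Python check follows; the function name `frobenius_is_additive` is our own.

```python
from math import comb


def frobenius_is_additive(p: int) -> bool:
    """True when every binomial coefficient C(p, k) with 0 < k < p is divisible by p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))


print([p for p in range(2, 20) if frobenius_is_additive(p)])
```

The property fails for composite exponents such as 4 or 6, which is why the argument is tied to the characteristic $p$.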

This proves the Proposition when $V$ has dimension $n-1$. Incidentally when the
coefficients in (8.1) are in some $\mathbf{F}_q$, then the argument for $s = 1$ shows that $x_1, \ldots, x_n$ are
in $C$. So they are $p$-th powers $x_1'^p, \ldots, x_n'^p$; and clearly $x'_1, \ldots, x'_n$ are in $\sqrt{G}$. Thus we get
the conclusion (iii) of the Proposition when $V$ has dimension $n-1$. And the case $s \neq 1$
leads of course to (i). So it remains only to treat $V$ of dimension $m-1 < n-1$ defined
over some $\mathbf{F}_q$.

This we do by expressing the affine equations of $V$ in triangular form, which after a
permutation we can suppose are

$$x_i = a_{i0} + a_{i1}x_1 + \cdots + a_{i,m-1}x_{m-1} \quad (i = m, m+1, \ldots, n) \tag{8.12}$$

with the $a_{ij}$ in $\mathbf{F}_q$. This gives $V = V_m \cap \cdots \cap V_n$ for the varieties defined individually by
each equation.

Consider the first equation. There may be some zero coefficients $a_{mj}$, but not all are
zero, because $V_m(\sqrt{G})$ is non-empty. In fact at least two are non-zero otherwise $V$ would
be contained in a coset $T \neq \mathbf{P}_n$ contrary to our assumption. We can thus regard $V_m$ as
a transversal variety of codimension 1 in some projective space of dimension at least 2
and at most $m < n$. Applying the Proposition for the cases already proved, we get two
possibilities (i), (iii). If (i) holds for $V_m$, then we get a proper subvariety $W_m$ of $V_m$ with

$$h(W_m) \leq 8n^54^nd\delta(n+r)R(\sqrt{G})^2. \tag{8.13}$$

But it is not difficult to see that each $W_m$ intersects the remaining intersection $U_m =
\bigcap_{i \neq m}V_i$ in a proper subspace of $V = V_m \cap U_m$. For example the triangular nature of
(8.12) makes it clear that $x_{m+1}, \ldots, x_n$ are determined by $x_1, \ldots, x_{m-1}$ on $U_m$, and then
that $x_m$ is determined by $x_1, \ldots, x_{m-1}$ on $W_m$ in $V_m$; but also some non-zero polynomial
of degree at most 1 in $x_1, \ldots, x_{m-1}$ must vanish on $W_m$. So $W = W_m \cap U_m$ has dimension
strictly less than $m-1$. By Lemma 7.1 we have $h(W) \leq h(W_m)$. So by (8.13) we get (i)
of the Proposition for the original $V$. But what happens if (iii) holds for $V_m$?

This means that all the $x_j$ actually occurring in the first equation of (8.12) are $p$-th
powers, which certainly goes some way in the direction of (iii) for $V$. But then we can try
the second equation instead. Either we get a $W$ as above, or all the $x_j$ actually occurring
in the second equation of (8.12) are $p$-th powers. And so on. In the end, we either get $W$ or
that all the $x_j$ actually occurring in all the equations (8.12) are $p$-th powers. Because $V$ is
transversal this does give the full (iii) for $V$; and so completes the proof of the Proposition.

9. The main estimate. This is a quantitative version of our Descent Step over $\sqrt{G}$
without the requirement that the subvarieties $W$ are isotrivial. This leads to a relatively
small exponent attached to the height $h(V)$. As before $n \ge 2$, and we continue with
our assumption that $K$ is finitely generated and transcendental over $\mathbf{F}_p$, with separable
transcendence basis $B$ and $\mathbf{F}_K = \overline{\mathbf{F}_p} \cap K$; further $G$ is finitely generated of rank $r \ge 1$
modulo $\mathbf{F}_K^*$.

Main Estimate. Let $V$ be a positive-dimensional linear subvariety of $\mathbf{P}_n$ defined over $K$
but not a coset.

(a) If $V$ is not $\sqrt{G}$-isotrivial, then

$$V(\sqrt{G}) = \bigcup_{W \in \mathcal{W}} W(\sqrt{G})$$

for a finite set $\mathcal{W}$ of proper linear subvarieties $W$ of $V$, also defined over $K$ and with

$$h(W) \le 8n^2 d\,(10n^3\delta(n+r))^{2n+1}\, h(V)^{2n} R(\sqrt{G})^{6n+2}.$$

(b) If $V$ is $\sqrt{G}$-isotrivial and $\psi(V)$ is defined over $\mathbf{F}_q$, then

$$V(\sqrt{G}) = \psi^{-1}\Big(\bigcup_{W \in \mathcal{W}} \bigcup_{e=0}^{\infty} (\psi(W)(\sqrt{G}))^{q^e}\Big)$$

for a finite set $\mathcal{W}$ of proper linear subvarieties $W$ of $V$, also defined over $K$ and with

$$h(\psi(W)) \le 8n^5 4^n (q/p)\, d\delta(n+r) R(\sqrt{G})^2.$$



Proof. We prove this first when $V$ is transversal and not contained in any coset $T \ne \mathbf{P}_n$.

We start with $\sqrt{G}$-isotrivial $V$. Because we estimate $h(\psi(W))$ and not $h(W)$, it clearly
suffices to assume that $\psi$ is the identity, so that $V$ is defined over $\mathbf{F}_q$. Take arbitrary $\pi$ in
$V(\sqrt{G})$ not in $V(\mathbf{F}_K)$. Then either (i) or (iii) of the Proposition holds.

If (i) holds, then (b) looks good with $e = 0$ (and $\psi$ the identity); at least $\pi$ lies in
some $W(\sqrt{G})$ for a proper subvariety $W$ of $V$, defined over $K$, with

$$h(W) \le 8n^5 4^n d\delta(n+r)R(\sqrt{G})^2. \tag{9.1}$$

What if (iii) holds? Now any $a$ in $\mathbf{F}_q$ has a unique $p$-th root $a^{1/p}$ in $\mathbf{F}_q$, which is also a
conjugate of $a$ over $\mathbf{F}_p$. We get a new point $\pi'$ in $V'(\sqrt{G})$, also not in $V'(\mathbf{F}_K)$, for a new
variety $V'$ in $\mathbf{P}_n$ which is a conjugate of $V$. The new variety has the same dimension as
$V$, and is also defined over $\mathbf{F}_q$. So we can repeat the process, and again we get either (i)
or (iii) of the Proposition.

If (i) holds, then $\pi'$ lies in some $W'(\sqrt{G})$ again with $W'$ over $K$ and $h(W')$ bounded
as in (9.1). So $\pi$ lies in $(W'(\sqrt{G}))^p$ as in (b) with $e = 1$.

Or if (iii) holds, then we get a new point $\pi''$ in $V''(\sqrt{G})$ for a new conjugate $V''$ of $V$
in $\mathbf{P}_n$.

And so on, in a manner similar to the looping in the $p$-automata of [D] section 4. Because
$\pi$ was not in $V(\mathbf{F}_K)$, this procedure must eventually stop at some proper subvariety
$W^{(L)}$ over $K$ of $V^{(L)}$ (here the number $L$ of repetitions might depend on $\pi$). Now the
original point $\pi$ lies in $(W^{(L)}(\sqrt{G}))^{p^L}$ with $h(W^{(L)})$ bounded as in (9.1).

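Because $\pi$ was arbitrary in $V(\sqrt{G})$ not in the finite set $V(\mathbf{F}_K)$, the conclusion so far
is

$$V(\sqrt{G}) \subset \bigcup_{W \in \mathcal{W}} \bigcup_{L=0}^{\infty} (W(\sqrt{G}))^{p^L}$$

for a collection $\mathcal{W}$ of proper subvarieties $W$ of conjugates of $V$ defined over $K$ and satisfying
(9.1); here we may have to include single points $W$ with $h(W) = 0$. To get equality we
write $q = p^f$ and $L = fe + l$ for $e \ge 0$ and $0 \le l \le f - 1$; this gives

$$V(\sqrt{G}) \subset \bigcup_{\widetilde{W} \in \widetilde{\mathcal{W}}} \bigcup_{e=0}^{\infty} (\widetilde{W}(\sqrt{G}))^{q^e}$$

with a new collection $\widetilde{\mathcal{W}}$ of proper subvarieties $\widetilde{W} = W^{p^l}$ of conjugates of $V$ with

$$h(\widetilde{W}) = p^l h(W) \le 8n^5 4^n (q/p)\, d\delta(n+r) R(\sqrt{G})^2.$$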

Finally by intersecting each $\widetilde{W}$ with $V = V^q$ we can assume that each $\widetilde{W}$ is a proper
subvariety of $V$ itself in the above, without increasing the height further. Because $V$ is
defined over $\mathbf{F}_q$, the $(\widetilde{W}(\sqrt{G}))^{q^e}$ now lie in $(V(\sqrt{G}))^{q^e} = V(\sqrt{G})$, and so at last the two
sides are equal. Now we have the desired (b); of course the finiteness of the collection of
$\widetilde{W}$ follows from the Northcott Property already noted in section 7. This settles the case
of transversal $\sqrt{G}$-isotrivial $V$ not contained in a proper coset.

Henceforth (until further notice) we will assume that $V$ is not $\sqrt{G}$-isotrivial (and still
transversal not contained in a proper coset).

Suppose first that $V$ is a hyperplane. Take arbitrary $\pi$ in $V(\sqrt{G})$. Then either (i) or
(ii) of the Proposition holds. We regard this dichotomy as the starting stage $l = 1$.

If (i) holds, then as before (a) of the Main Estimate looks good; at least $\pi$ lies in some
$W(\sqrt{G})$ for a proper subvariety $W$ of $V$, defined over $K$, with

$$h(W) \le C h(V)^{2n} \tag{9.2}$$

for

$$C = 8n^5 4^n d\delta(n+r) R(\sqrt{G})^2. \tag{9.3}$$

What if (ii) holds? We get a new point $\pi'$ in $V'(\sqrt{G})$ for a new variety $V'$ in $\mathbf{P}_n$ with

$$\pi = \psi(\pi'^{\,p}), \quad V = \psi(V'^{\,p}). \tag{9.4}$$

Here $\psi$ is a $\sqrt{G}$-automorphism with

$$h(\psi) \le pB \tag{9.5}$$

for

$$B = n\delta(n+r) R(\sqrt{G})^2. \tag{9.6}$$



This $V'$ is also a hyperplane, and also not $\sqrt{G}$-isotrivial. So we can repeat the process,
and again we get either (i) or (ii) of the Proposition. This dichotomy is the next stage
$l = 2$.

If (i) holds, then $\pi'$ lies in some $W'(\sqrt{G})$. So $\pi$ lies in $W(\sqrt{G})$ for $W = \psi(W'^{\,p})$,
almost as good as above, except that $h(W)$ could be larger than before. We take care of
this later.

Or if (ii) holds, then we get a new point $\pi''$ in $V''(\sqrt{G})$ for a new variety $V''$ in $\mathbf{P}_n$.
And so on. At stage $l$ we get either $\pi^{(l-1)}$ in a proper subvariety $W^{(l-1)}$ of $V^{(l-1)}$
with

$$h(W^{(l-1)}) \le C h(V^{(l-1)})^{2n} \tag{9.7}$$

as in (9.2) and (9.3), or a new point $\pi^{(l)}$ in $V^{(l)}(\sqrt{G})$ for a new variety $V^{(l)}$ with

$$\pi^{(l-1)} = \psi^{(l-1)}((\pi^{(l)})^p), \quad V^{(l-1)} = \psi^{(l-1)}((V^{(l)})^p) \tag{9.8}$$

as in (9.4), for

$$h(\psi^{(l-1)}) \le pB \tag{9.9}$$

as in (9.5) and (9.6).

We claim that this procedure must eventually stop because $V$ is not $\sqrt{G}$-isotrivial,
and after a certain number $L$ of repetitions which this time is independent of $\pi$. Actually
let us define the integer $L_0 \ge 0$ by

$$p^{L_0} \le 2h(V)R(\sqrt{G}) < p^{L_0+1}. \tag{9.10}$$

From (9.8) we obtain $V = \psi_l((V^{(l)})^{p^l})$ with the $\sqrt{G}$-automorphism

$$\psi_l = \psi\,\psi'^{\,p} \cdots (\psi^{(l-1)})^{p^{l-1}}. \tag{9.11}$$

Writing the hyperplane $V$ in the affine form (8.1), we know that some coefficient $x = a_j \ne 0$
does not lie in $\sqrt{G}$, and $x = g y^{p^l}$ for some $g$ in $\sqrt{G}$ and some $y$ in $K$. We can now apply
Lemma 4.5, because $\sqrt{G}_K$ there is just $\sqrt{G}$. We conclude that

$$p^l \le 2h(x)R(\sqrt{G}) \le 2h(V)R(\sqrt{G}).$$

In view of (9.10) this means that (ii) cannot hold for $l = L_0 + 1$. Thus there is some $L$
with $0 \le L \le L_0$ such that (ii) holds at stages $l = 1, \ldots, L$ (at least if $L \ge 1$), and then
(i) holds at stage $l = L + 1$. We conclude that $\pi^{(L)}$ lies in $W^{(L)}$, and from (9.7)

$$h(W^{(L)}) \le C h(V^{(L)})^{2n}. \tag{9.12}$$

Thus $\pi = \psi_L((\pi^{(L)})^{p^L})$ lies in $W = \psi_L((W^{(L)})^{p^L})$. By (7.1) and (9.11) we get

$$h(W) \le p^L h(W^{(L)}) + n h(\psi_L) \le p^L h(W^{(L)}) + n\big(h(\psi) + p\,h(\psi') + \cdots + p^{L-1} h(\psi^{(L-1)})\big),$$

which using (9.9) and (9.12) yields

$$h(W) \le C p^L h(V^{(L)})^{2n} + 2np^L B \le C (p^L h(V^{(L)}))^{2n} + 2np^L B. \tag{9.13}$$

To estimate $h(V^{(L)})$ we use (7.1), (9.8) and (9.9) to get

$$p\,h(V^{(l)}) = h((\psi^{(l-1)})^{-1} V^{(l-1)}) \le h(V^{(l-1)}) + n^2 h(\psi^{(l-1)}) \le h(V^{(l-1)}) + n^2 pB.$$

If $L \ge 1$ we multiply this by $p^{l-1}$ and sum from $l = 1$ to $l = L$, getting $p^L h(V^{(L)}) \le
h(V) + 2n^2 p^L B$ (which holds also if $L = 0$). Inserting this into (9.13) we get

$$h(W) \le C(h(V) + 2n^2 p^L B)^{2n} + 2np^L B \le 2C(h(V) + 2n^2 p^L B)^{2n},$$

and then using (9.6) and (9.10) with $L \le L_0$ we find

$$h(W) \le 2C h(V)^{2n}\big(1 + 4n^3\delta(n+r)R(\sqrt{G})^3\big)^{2n} \le 2C h(V)^{2n}\big(5n^3\delta(n+r)R(\sqrt{G})^3\big)^{2n}.$$

From (9.3) we get finally

$$h(W) \le C' h(V)^{2n} R(\sqrt{G})^{6n+2} \tag{9.14}$$

with

$$C' = 16n^5 4^n d\delta(n+r)\,(5n^3\delta(n+r))^{2n} \le 2n^2 d\,(10n^3\delta(n+r))^{2n+1}.$$
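The final inequality between the two constants can be checked mechanically. The following is a minimal sketch, not part of the original argument: it writes `s` for the quantity multiplying $d$ in (9.3) (the factor printed together with $(n + r)$), treats it as a single positive integer (an assumption made only for this illustration), and compares both sides on a small grid. The function names are hypothetical.

```python
def c_prime(n, d, s):
    """Left-hand constant of the display: 16 n^5 4^n d s (5 n^3 s)^(2n)."""
    return 16 * n**5 * 4**n * d * s * (5 * n**3 * s) ** (2 * n)


def simplified_bound(n, d, s):
    """Right-hand constant of the display: 2 n^2 d (10 n^3 s)^(2n+1)."""
    return 2 * n**2 * d * (10 * n**3 * s) ** (2 * n + 1)


def check_constants(max_n=6, max_d=4, max_s=12):
    """Return True if c_prime <= simplified_bound on the whole sample grid."""
    return all(
        c_prime(n, d, s) <= simplified_bound(n, d, s)
        for n in range(2, max_n + 1)
        for d in range(1, max_d + 1)
        for s in range(1, max_s + 1)
    )
```

Collecting powers shows that both sides are multiples of $100^n n^{6n+5} d s^{2n+1}$, with coefficients 16 and 20 respectively, so the ratio is always 4/5.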
Because $\pi$ was arbitrary, the conclusion so far is

$$V(\sqrt{G}) \subset \bigcup_{W \in \mathcal{W}} W(\sqrt{G})$$

for a finite collection $\mathcal{W}$ of proper subvarieties $W$ of $V$ satisfying (9.14). But then the two
sides are of course equal. This settles the Main Estimate for transversal hyperplanes $V$
that are not $\sqrt{G}$-isotrivial and not contained in a proper coset.

Next suppose that $V$, still not $\sqrt{G}$-isotrivial (and still transversal not contained in a
proper coset), has dimension $m - 1$ for some $m < n$. So after a permutation of variables
it can be defined by equations (6.1). Each of these equations defines a hyperplane $V_i$, so
that $V = V_m \cap \cdots \cap V_n$.

We claim that we can assume that all non-zero $a(i,j)$ lie in $\sqrt{G}$. Otherwise for
example $V_m$ is transversal and not $\sqrt{G}$-isotrivial in the projective space with coordinates
$X_j$ corresponding to $j = m$ and the $j$ with $a(m,j) \ne 0$. Since no $X_m - aX_j$ ($m \ne j$, $a \ne 0$)
vanishes on $V$, this projective space has dimension at least 2. So then we could apply
the hyperplane result (9.14) to deduce that all solutions lie in a finite union of proper
subspaces $W_m$ of this $V_m$ with

$$h(W_m) \le C' h(V_m)^{2n} R(\sqrt{G})^{6n+2}.$$

But as in the affine situation just after (8.13), it can be seen that $W_m$ intersects the remaining
intersection $U_m = \bigcap_{i \ne m} V_i$ in a proper subspace of $V = V_m \cap U_m$. For example the
triangular nature of (6.1) makes it clear that $X_{m+1}, \ldots, X_n$ are determined by $X_0, \ldots, X_{m-1}$
on $U_m$, and then that $X_m$ is determined by $X_0, \ldots, X_{m-1}$ on $W_m$ in $V_m$; but also some
non-zero linear form in $X_0, \ldots, X_{m-1}$ must vanish on $W_m$. Therefore $W = W_m \cap U_m$ has
dimension strictly less than $m - 1$. So we are indeed in a proper subspace as required by
(a) of the Main Estimate. Further $W = W_m \cap V$ and so $h(W) \le h(W_m) + h(V)$ by Lemma
7.1; moreover $h(V_m) \le h(V)$ because the $a(m,j)$ are themselves among the Grassmannian
coordinates of $V$. We end up with (9.14) with say an extra factor 2.

So indeed from now on we can assume that all non-zero $a$ in (6.1) lie in $\sqrt{G}$. This
means that we are set up to apply Lemma 6.1. We will see that the effect is to pass to
a proper subvariety of at least one of $V_m, \ldots, V_n$ despite their being separately isotrivial.
As $V$ is not $\sqrt{G}$-isotrivial by assumption, we find some quotient (6.2), say $Q$, not lying
in $\mathbf{F}_K$. Let $\pi = (\xi_0, \ldots, \xi_n)$ be any point of $V(\sqrt{G})$. For a typical factor $a(i,j)/a(i,j')$ in $Q$ we
apply part (b) of the Main Estimate in lower dimensions to $V_i$, with $\psi_i$ determined by 1
and the non-zero $a(i,j)$. So here $q = p$. We find finitely many proper subspaces $W_i$ of $V_i$
such that $\psi_i(V_i(\sqrt{G}))$ lies in the union of the $\bigcup_{e=0}^{\infty} (\psi_i(W_i)(\sqrt{G}))^{p^e}$, with

$$h(\psi_i(W_i)) \le 8n^5 4^n d\delta(n+r) R(\sqrt{G})^2 \tag{9.15}$$

(now independent of $p$). In particular, writing $\pi_i$ for the projection of $\pi$ to the lower
dimensional space, we have equations

$$\psi_i(\pi_i) = \sigma_i^{q_i} \tag{9.16}$$

for $\sigma_i$ in some $\psi_i(W_i)$ and some power $q_i$ of $p$. Thus $a(i,j)\xi_j / a(i,j')\xi_{j'} = \eta^{q_i}$ for certain $\eta = \eta(i,j,j')$
in $K^*$. Multiplying all these over the factors in (6.2) we find $Q = \eta_1^{q_1} \cdots \eta_k^{q_k}$ for certain
$\eta_1, \ldots, \eta_k$ in $K^*$. Because the fixed $Q$ is not in $\mathbf{F}_K$, this forces $q = \min\{q_1, \ldots, q_k\}$ to be
bounded above by some quantity depending only on $V$. In fact $h(Q) \ge q$, but on the other
hand from (6.2) we see that $h(Q) \le (n+1)h(V)$. Thus

$$q \le (n+1)h(V). \tag{9.17}$$

Say this minimum is $q = q_i$. Now (9.16) says that $\pi_i$ and so $\pi$ lies in the variety
$U = \psi_i^{-1}(\psi_i(W_i))^q$ of dimension strictly less than the dimension of $V_i$. This intersects $V_i$
in a proper subvariety $W_i'$ of $V_i$. Once more this $W_i'$ intersects the remaining intersection
$\bigcap_{i' \ne i} V_{i'}$ in a proper subvariety $W$ of $V$. As for heights, we have $W = W_i' \cap V$ so $h(W) \le
h(W_i') + h(V)$. Also $h(W_i') \le h(U) + h(V_i) \le h(U) + h(V)$, and also

$$h(U) \le q\,h(\psi_i(W_i)) + n\,h(\psi_i^{-1}) \le q\,h(\psi_i(W_i)) + n^2 h(V_i)$$

because of the definition of $\psi_i$. Putting these together and using (9.15), (9.17) we conclude
that

$$h(W) \le 8n^5(n^2 + n + 3)4^n d\delta(n+r) h(V) R(\sqrt{G})^2.$$

This is much smaller than (9.14), and so we have completed the proof of the Main Estimate
when $V$ is transversal and not contained in a proper coset. In case (a) we have reached so
far the bound $h(W) \le A h(V)^{2n} R^{6n+2}$ with $R = R(\sqrt{G})$ and $A = 4n^2 d\,(10n^3\delta(n+r))^{2n+1}$
due to the extra factor 2 encountered after establishing (9.14).

To treat the more general situation when $V$ is transversal and not itself a coset,
we use induction on $n \ge 2$, and we will obtain in case (a) the slightly weaker result
$h(W) \le A h(V)^{2n} R^{6n+2} + n h(V)$. This leads at once to the bound given in the Main
Estimate.

If $n = 2$ then there is a single equation $a_0X_0 + a_1X_1 + a_2X_2 = 0$, and transversality
implies all $a_i \ne 0$. Thus no $X_i - aX_j$ ($i \ne j$, $a \ne 0$) vanishes on $V$, and we are done. Thus
we can suppose that $n \ge 3$.

After permuting the variables, we can suppose that $X_n - aX_{n-1}$ ($a \ne 0$) vanishes on
$V$. In the remaining equations for $V$ we may eliminate $X_n$ to obtain a linear variety $\widetilde{V}$
in $\mathbf{P}_{n-1}$. This $\widetilde{V}$ cannot be a coset otherwise $V$ would be. Also $\widetilde{V}$ certainly involves the
variables $X_0, \ldots, X_{n-2}$ and so is transversal in $\mathbf{P}_{\tilde{n}}$ for $\tilde{n} = n - 2$ or $\tilde{n} = n - 1$. Here $\tilde{n} \ge 2$
unless $n = 3$; but in that case if $\widetilde{V}$ is not transversal in $\mathbf{P}_2$ then $V$ would be defined by
equations $X_3 = aX_2$ and $b_0X_0 + b_1X_1 = 0$ so would be a coset. Thus we can assume that
$\widetilde{V}$ is transversal in $\mathbf{P}_{\tilde{n}}$ with $\tilde{n} \ge 2$.

Suppose first that $V$ is not $\sqrt{G}$-isotrivial as in (a). Then $\widetilde{V}$ cannot be $\sqrt{G}$-isotrivial
otherwise we could transform $X_n$ to make $V$ isotrivial. Thus by induction the Main
Estimate holds for $\widetilde{V}$. It is now relatively straightforward to deduce the Main Estimate
for $V$. Thus by case (a) for $\widetilde{V}$ we get

$$\widetilde{V}(\sqrt{G}) = \bigcup_{\widetilde{W} \in \widetilde{\mathcal{W}}} \widetilde{W}(\sqrt{G}) \tag{9.18}$$

for a finite set $\widetilde{\mathcal{W}}$ of proper linear subvarieties $\widetilde{W}$ of $\widetilde{V}$, also defined over $K$ and with
$h(\widetilde{W}) \le A h(\widetilde{V})^{2n} R^{6n+2} + (n-1)h(\widetilde{V})$. Now we will check that (a) for $V$ follows with
$W$ defined by the equations of $\widetilde{W}$ together with $X_n = aX_{n-1}$. First the upper bound of
Lemma 7.1 gives

$$h(W) \le h(\widetilde{W}) + h(a) \le A h(\widetilde{V})^{2n} R^{6n+2} + (n-1)h(\widetilde{V}) + h(a). \tag{9.19}$$

We can suppose $X_{n-1} \ne 0$ on $\widetilde{V}$, else (9.18) would be empty; and so the lower bound of
Lemma 7.1 gives $h(V) \ge \max\{h(\widetilde{V}), h(a)\}$. Therefore (9.19) implies

$$h(W) \le A h(V)^{2n} R^{6n+2} + n h(V)$$

as required.

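And in case (b) for $\sqrt{G}$-isotrivial $V$ (assuming as above that $\psi$ is the identity) we
see that $\widetilde{V}$ is $\sqrt{G}$-isotrivial and $a$ lies in $\mathbf{F}_q$. We get (b) for $V$ from (b) for $\widetilde{V}$ using the
analogue $\widetilde{V}(\sqrt{G}) = \bigcup_{\widetilde{W} \in \widetilde{\mathcal{W}}} \bigcup_{e=0}^{\infty} (\widetilde{W}(\sqrt{G}))^{q^e}$ of (9.18) with as above $W$ defined by the
equations of $\widetilde{W}$ together with $X_n = aX_{n-1}$; now $h(W) \le h(\widetilde{W})$.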

What if $V$ is not transversal (and of course still not a coset)? Then it is transversal
(and still not a coset) in some projective subspace of dimension $n' \le n - 1$. Here $n' \ge 2$
otherwise it would be a coset. The above cases (a) and (b) in dimension $n'$ now lead
immediately to the same cases in $\mathbf{P}_n$; we have merely ignored $n - n'$ projective variables
that were never in the equations anyway.

This finally finishes the proof of the Main Estimate. 

In view of the fact that the estimate in case (a) is independent of the characteristic
$p$, it may seem a nuisance that the estimate in case (b) depends on $p$. But actually this is
unavoidable, and there are even examples to show that the full $q/p$ is needed. To see this,
take any power $q > 1$ of $p$, and define $K = \mathbf{F}_q(t)$ with $G = \sqrt{G}$ generated by $t$, $1 - t$ and a
generator $\zeta$ of $\mathbf{F}_q^*$. Here we have $r = 2$, $R(\sqrt{G}) = \sqrt{3}$ and, with the obvious transcendence
basis, $d = 1$. The affine equations

$$x + y = 1, \quad x + \zeta z = 1$$

give rise to a $\sqrt{G}$-isotrivial line $V$ (with $h(V) = 0$ and $\psi$ the identity), and an upper bound
$B$ in (b) would mean that all solutions over $\sqrt{G}$ are given by $w, w^q, w^{q^2}, \ldots$ for some $w$
with $h(w) \le B$. Thus every solution $\pi$ would have either $h(\pi) \le B$ or $h(\pi) \ge q$. But



$$\pi = (x, y, z) = \big(t^{q/p},\ (1-t)^{q/p},\ \zeta^{-1}(1-t)^{q/p}\big)$$

is a solution with $h(\pi) = q/p$. It follows that $B \ge q/p$.



10. Isotrivial W. We show here how to ensure that all the subvarieties $W$ in the Main
Estimate can be made $\sqrt{G}$-isotrivial, at the expense of enlarging the exponents in the
upper bounds for their heights. To simplify the various expressions we abbreviate the
factors in case (a) of the Main Estimate by

$$A = A(n,r,d) = 8n^2 d\,(10n^3\delta(n+r))^{2n+1} \ge 1, \quad h = h(V), \quad R = R(\sqrt{G}), \tag{10.1}$$

and that in case (b) of the Main Estimate by

$$\Phi = \Phi(n,r,d,p,q) = 8n^5 4^n (q/p)\, d\delta(n+r) \ge 1.$$

We also define some exponents

$$\rho(m) = \rho_n(m) = \frac{(2n)^m - 1}{2n - 1}, \quad \eta(m) = \eta_n(m) = (2n)^m \quad (m = 1, 2, \ldots).$$

Main Estimate for isotrivial W. Let $V$ be a linear subvariety of $\mathbf{P}_n$ defined over $K$ but
not a coset, with dimension $m - 1 \ge 1$.
(a) If $V$ is not $\sqrt{G}$-isotrivial, then

$$V(\sqrt{G}) = \bigcup_{W \in \mathcal{W}} W(\sqrt{G})$$




(10.2) 



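for a finite set $\mathcal{W}$ of proper linear $\sqrt{G}$-isotrivial subvarieties $W$ of $V$, also defined over $K$
and with

$$h(W) \le (AR^{6n+2})^{\rho(m)} h^{\eta(m)}. \tag{10.3}$$

(b) If $V$ is $\sqrt{G}$-isotrivial and $\psi(V)$ is defined over $\mathbf{F}_q$, then

$$V(\sqrt{G}) = \psi^{-1}\Big(\bigcup_{W \in \mathcal{W}} \bigcup_{e=0}^{\infty} (\psi(W)(\sqrt{G}))^{q^e}\Big)$$

for a finite set $\mathcal{W}$ of proper linear $\sqrt{G}$-isotrivial subvarieties $W$ of $V$, also defined over $K$
and with

$$h(\psi(W)) \le (AR^{6n+2})^{\rho(m-1)} (\Phi R^2)^{\eta(m-1)}.$$
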
Proof. We start with case (a), and now we can write the bound as

$$h(W) \le A h^{2n} R^{6n+2} \tag{10.4}$$

with $W$ not necessarily $\sqrt{G}$-isotrivial. We show by induction on the dimension $m - 1 \ge 1$
of $V$ that the increased bound

$$h(W) \le (AR^{6n+2})^{\rho(m)} h^{\eta(m)} \tag{10.5}$$

as in (10.3) holds where now all the $W$ are $\sqrt{G}$-isotrivial.

When $m = 2$ then the $W$ are points and so automatically $\sqrt{G}$-isotrivial as long as
$W(\sqrt{G})$ is non-empty.

When $m \ge 3$ we are fine unless some $W$ is not $\sqrt{G}$-isotrivial. We observe that such a
$W$ cannot be a coset $T$. For the latter is defined by finitely many $X_i = a_{ij}X_j$ ($a_{ij} \ne 0$),
and if $T(\sqrt{G})$ is non-empty then clearly each $a_{ij}$ lies in $\sqrt{G}$. But now it is easy to see that
$T$ is $\sqrt{G}$-isotrivial after all. For example we can rewrite the equations as $a_iX_i = a_jX_j$ with
$a_i, a_j$ in $\sqrt{G}$. Then we can set up an equivalence relation on $\{0, 1, \ldots, n\}$ characterized by
the equivalence of such $i, j$. And now we need change only the variables in the equivalence
classes of cardinality at least 2 in order to trivialize $T$.

So by induction each of these $W$ satisfies

$$W(\sqrt{G}) = \bigcup_{\widetilde{W} \in \widetilde{\mathcal{W}}} \widetilde{W}(\sqrt{G})$$

with $\sqrt{G}$-isotrivial $\widetilde{W}$ such that

$$h(\widetilde{W}) \le (AR^{6n+2})^{\rho(m-1)} h(W)^{\eta(m-1)}.$$

Therefore all we have to do is substitute (10.4) into this. We find the upper bound (10.5)
because

$$\rho(m-1) + \eta(m-1) = \rho(m), \quad 2n\,\eta(m-1) = \eta(m).$$
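The two exponent identities used in this substitution are elementary, and a short numerical check makes the bookkeeping explicit. The sketch below is only an illustration: it encodes the exponents $\rho$ and $\eta$ defined above as hypothetical helper functions `rho` and `eta`, and computes the exponents of $AR^{6n+2}$ and of $h$ that arise when (10.4) is substituted into the inductive bound.

```python
def rho(n, m):
    """rho_n(m) = ((2n)^m - 1) / (2n - 1); the division is exact."""
    return ((2 * n) ** m - 1) // (2 * n - 1)


def eta(n, m):
    """eta_n(m) = (2n)^m."""
    return (2 * n) ** m


def substituted_exponents(n, m):
    """Exponents of (A R^(6n+2)) and of h after substituting (10.4).

    The inductive bound (A R^(6n+2))^rho(m-1) * h(W)^eta(m-1), combined with
    h(W) <= A h^(2n) R^(6n+2), gives (A R^(6n+2))^(rho(m-1) + eta(m-1)) and
    h^(2n * eta(m-1)).
    """
    return rho(n, m - 1) + eta(n, m - 1), 2 * n * eta(n, m - 1)
```

For every $n \ge 2$ and $m \ge 2$ the pair returned by `substituted_exponents(n, m)` coincides with `(rho(n, m), eta(n, m))`, which is exactly the claim.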

For case (b) we write the bound as

$$h(\psi(W)) \le \Phi R^2 \tag{10.6}$$

with $W$ not necessarily $\sqrt{G}$-isotrivial. If some $W$ is not $\sqrt{G}$-isotrivial, then neither is
$\psi(W)$, and we can write

$$\psi(W)(\sqrt{G}) = \bigcup_{W^* \in \mathcal{W}^*} W^*(\sqrt{G})$$

with $\sqrt{G}$-isotrivial $W^*$ such that

$$h(W^*) \le (AR^{6n+2})^{\rho(m-1)} h(\psi(W))^{\eta(m-1)}. \tag{10.7}$$

Now we can see (without induction) that the bound

$$h(\psi(\widetilde{W})) \le (AR^{6n+2})^{\rho(m-1)} (\Phi R^2)^{\eta(m-1)} \tag{10.8}$$

holds, where now all the $\widetilde{W} = \psi^{-1}(W^*)$ are $\sqrt{G}$-isotrivial. In fact just as above, all we
have to do is substitute (10.6) into (10.7), and we find at once (10.8). This completes the
proof.

11. Points over G. We show here how to replace $V(\sqrt{G})$ and $W(\sqrt{G})$ in the Main
Estimate by $V(G)$ and $W(G)$ at the expense of worsening the dependence on the regulator.
However we no longer insist that the $W$ are isotrivial. If needed, this could be secured just
by repeating the arguments of the previous section. We retain the notations (10.1), (10.2)
from that section. Of course $n \ge 2$, and we continue with our assumption that $K$ is finitely
generated over $\mathbf{F}_p$, with $\mathbf{F}_K = \overline{\mathbf{F}_p} \cap K$; further $G$ is finitely generated of rank $r \ge 1$ modulo
$\mathbf{F}_K^*$.



Main Estimate for points over G. There is a positive integer $f = f_K(G) \le [\sqrt{G} : G]$,
depending only on $K$ and $G$, with the following property. Let $V$ be a positive-dimensional
linear subvariety of $\mathbf{P}_n$ defined over $K$ but not a coset.

(a) If $V$ is not $\sqrt{G}$-isotrivial, then

$$V(G) = \bigcup_{W \in \mathcal{W}} W(G)$$

for a finite set $\mathcal{W}$ of proper linear subvarieties $W$ of $V$, also defined over $K$ and with

$$h(W) \le A h^{2n} R(\sqrt{G})^{6n+2}.$$

(b) If $V$ is $\sqrt{G}$-isotrivial and $\psi(V)$ is defined over $\mathbf{F}_q$, then either

(ba) we have

$$V(G) = \bigcup_{W \in \mathcal{W}} W(G)$$

for a finite set $\mathcal{W}$ of proper linear subvarieties $W$ of $V$, also defined over $K$ and with

$$h(\psi(W)) \le |\mathbf{F}_K|\, \Phi R(G)^2$$

or

(bb) we have

$$V(G) = \psi^{-1}\Big(\bigcup_{W \in \mathcal{W}} \bigcup_{e=0}^{\infty} (\psi(W)(G))^{q^{fe}}\Big) \tag{11.1}$$

for a finite set $\mathcal{W}$ of proper linear subvarieties $W$ of $V$, also defined over $K$ and with

$$h(\psi(W)) \le q^f |\mathbf{F}_K|\, \Phi R(G)^2. \tag{11.2}$$



We need first a simple remark about congruences. Here $\phi$ is the Euler function.

Lemma 11.1. For a given power $Q > 1$ of a prime $P$ consider a finite collection of
congruence equations

$$LQ^e \equiv M \bmod N \tag{11.3}$$

with $N$ taken from a finite set $\mathcal{N}$ of positive integers and $L, M$ taken from $\mathbf{Z}$. Suppose that
the set of solutions $e \ge 0$ is non-empty. Then if there is some $M \ne 0$ with $\mathrm{ord}_P M < \mathrm{ord}_P N$
this set is

(a) finite with $Q^e < \max_{N \in \mathcal{N}} N$,
and otherwise

(b) a finite union of arithmetic progressions $e = e_0, e_0 + f, e_0 + 2f, \ldots$ with $f =
\prod_{N \in \mathcal{N}} \phi(N)$ and $Q^{e_0} < Q^f \max_{N \in \mathcal{N}} N$.

Proof. Suppose first that there is some $M \ne 0$ with $\mathrm{ord}_P M < \mathrm{ord}_P N$. Then the
corresponding $L \ne 0$, and we get

$$e\,\mathrm{ord}_P Q \le \mathrm{ord}_P LQ^e = \mathrm{ord}_P M < \mathrm{ord}_P N$$

giving case (a).

Thus we can assume that $\mathrm{ord}_P M \ge \mathrm{ord}_P N$ whenever $M \ne 0$. We proceed to verify
case (b). Now the congruences (11.3) can be split into congruences modulo powers of $P$
and congruences modulo powers $\tilde{P}^m$ of other primes $\tilde{P} \ne P$.

The former congruences, if any, will be satisfied as soon as $e$ is sufficiently large.
Indeed they amount to $LQ^e \equiv 0 \bmod P^{\mathrm{ord}_P N}$ and so conditions $e \ge \lambda$ for various real
$\lambda \le \mathrm{ord}_P N / \mathrm{ord}_P Q$; that is, $Q^\lambda \le P^{\mathrm{ord}_P N} \le N$. Thus together they give a single condition $e \ge \Lambda$
for some real $\Lambda$ with $Q^\Lambda \le \max_{N \in \mathcal{N}} N$.

We note that whether $e$ satisfies the other congruences depends only on its congruence
class modulo $f$. For if $\tilde{P}^m$ divides some $N$ then $\phi(\tilde{P}^m)$ divides $\phi(N)$ which divides $f$, and
so $Q^f \equiv 1 \bmod \tilde{P}^m$.

Thus the solutions $e$ satisfy $e \ge \Lambda$ and also must lie in a finite number of arithmetic
progressions modulo $f$. If $e_0$ is the smallest member of one of these progressions with
$e_0 \ge \Lambda$, then $e_0 - f < \Lambda$ and this leads to case (b), thereby completing the proof.
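The dichotomy in Lemma 11.1 can be observed on small systems by brute force. The following is a minimal sketch under stated assumptions: each congruence is encoded as a hypothetical triple `(L, M, N)`, the solution set is enumerated only up to a finite `bound` (so it illustrates the lemma rather than proving anything), and $f$ is taken as the product of $\phi(N)$ over the distinct moduli.

```python
from math import gcd


def euler_phi(n):
    """Euler's function by direct counting (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def solution_set(Q, congruences, bound):
    """All e in [0, bound) with L * Q^e = M (mod N) for every (L, M, N)."""
    return [
        e
        for e in range(bound)
        if all((L * pow(Q, e, N) - M) % N == 0 for (L, M, N) in congruences)
    ]


def period(congruences):
    """f = product of phi(N) over the distinct moduli N."""
    f = 1
    for N in sorted({N for (_, _, N) in congruences}):
        f *= euler_phi(N)
    return f


def progression_starts(Q, congruences, bound):
    """Solutions e below bound for which e - f is not itself a solution."""
    sols = set(solution_set(Q, congruences, bound))
    f = period(congruences)
    return sorted(e for e in sols if e - f not in sols)
```

For example, with $Q = 3$ the single congruence $3^e \equiv 3 \bmod 9$ has $\mathrm{ord}_3 M < \mathrm{ord}_3 N$ and only the solution $e = 1$, as in case (a). The system $3^e \equiv 1 \bmod 7$, $3^e \equiv 0 \bmod 9$ falls under case (b): here $f = \phi(7)\phi(9) = 36$ and the solutions are the multiples of 6 from 6 onwards, which split into six progressions modulo 36 with starting values at most 36.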

We can now start on the proof of the Main Estimate for points over G.
Suppose first that $V$ is not $\sqrt{G}$-isotrivial. Then (a) of the Main Estimate gives

$$V(\sqrt{G}) = \bigcup_{W \in \mathcal{W}} W(\sqrt{G})$$

for $W$ satisfying (10.4). Now we can descend to $G$ simply by intersecting with $\mathbf{P}_n(G)$.

Next suppose that $V$ is $\sqrt{G}$-isotrivial and $\psi(V)$ is defined over $\mathbf{F}_q$. Using elementary
divisors we can find generators $\gamma_1, \ldots, \gamma_r$ of $\sqrt{G}$ modulo constants and positive integers
$d_1, \ldots, d_r$ such that $\gamma_1^{d_1}, \ldots, \gamma_r^{d_r}$ generate $G$ modulo constants. The constants can be taken
care of with an extra $\gamma_0$ generating $\sqrt{G} \cap \mathbf{F}_K^*$ and $\gamma_0^{d_0}$ generating $G \cap \mathbf{F}_K^*$; here $d_0$ divides
the order of $\gamma_0$ as a root of unity. Thus

$$[\sqrt{G} : G] = d_0 d_1 \cdots d_r. \tag{11.4}$$

We write

$$\psi(X_0, \ldots, X_n) = (\psi_0X_0, \ldots, \psi_nX_n)$$

with

$$\psi_i = \gamma_0^{a_{0i}} \gamma_1^{a_{1i}} \cdots \gamma_r^{a_{ri}} \quad (i = 0, \ldots, n) \tag{11.5}$$

in $\sqrt{G}$. Now (b) of the Main Estimate gives

$$V(\sqrt{G}) = \psi^{-1}\Big(\bigcup_{W \in \mathcal{W}} \bigcup_{e=0}^{\infty} (\psi(W)(\sqrt{G}))^{q^e}\Big) \tag{11.6}$$

for $W$ satisfying (10.6). But we can no longer descend to $G$ simply by intersecting with
$\mathbf{P}_n(G)$.

Consider a point $\pi = (\pi_0, \ldots, \pi_n)$ of $V(G)$. By (11.6) there is a point $\sigma = (\sigma_0, \ldots, \sigma_n)$
in some $W(\sqrt{G})$ and some $e \ge 0$ such that $\pi = \psi^{-1}(\psi(\sigma))^{q^e}$. As in (11.5) we write

$$\sigma_i = \gamma_0^{b_{0i}} \gamma_1^{b_{1i}} \cdots \gamma_r^{b_{ri}} \quad (i = 0, \ldots, n); \tag{11.7}$$

however $\pi$ is over $G$ and so

$$\pi_i = \gamma_0^{c_{0i}d_0} \gamma_1^{c_{1i}d_1} \cdots \gamma_r^{c_{ri}d_r} \quad (i = 0, \ldots, n).$$

Equating exponents we find a system of congruences

$$(a_{ji} + b_{ji})q^e \equiv a_{ji} \bmod d_j \quad (i = 0, \ldots, n;\ j = 0, 1, \ldots, r) \tag{11.8}$$

depending only on $\sigma$. We can apply Lemma 11.1, and the argument splits into two according
to the conclusion. As the $b_{ji}$ in (11.7) appear only in the coefficients $L$, the splitting
is independent of $\sigma$.

Suppose first that Lemma 11.1(a) holds. Then

$$q^e < \max\{d_0, d_1, \ldots, d_r\} \le d_0 d_1 \cdots d_r = [\sqrt{G} : G] \tag{11.9}$$

by (11.4). Now $\pi$ lies in the finitely many $\widetilde{W} = \psi^{-1}(\psi(W))^{q^e}$, which we can put together
into a set $\widetilde{\mathcal{W}}$, and then we have shown that

$$V(G) \subset \bigcup_{\widetilde{W} \in \widetilde{\mathcal{W}}} \widetilde{W}(\sqrt{G}).$$

Now intersecting with $\mathbf{P}_n(G)$ gives the same inclusion but with $\widetilde{W}(G)$ on the right-hand
side. On the other hand

$$\widetilde{W} = \psi^{-1}(\psi(W))^{q^e} \subset \psi^{-1}(\psi(V))^{q^e} = \psi^{-1}(\psi(V)) = V$$

because $\psi(V)$ is defined over $\mathbf{F}_q$. Thus we conclude

$$V(G) = \bigcup_{\widetilde{W} \in \widetilde{\mathcal{W}}} \widetilde{W}(G)$$

as in (ba) of the Main Estimate for points over G. But now from (11.9) and (10.6) the
heights satisfy

$$h(\psi(\widetilde{W})) = q^e h(\psi(W)) \le d_0 d_1 \cdots d_r\, \Phi R(\sqrt{G})^2.$$

Using Lemma 4.1 we see that $R(G) = d_1 \cdots d_r R(\sqrt{G})$, and so we can absorb some terms
into the regulator to get

$$h(\psi(\widetilde{W})) \le d_0 \Phi R(G)^2 \le |\mathbf{F}_K|\, \Phi R(G)^2. \tag{11.10}$$

This completes the proof of (ba).

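It remains only to suppose that Lemma 11.1(b) holds. Then we know that $e = e_0 + f\tilde{e}$
with $\tilde{e} \ge 0$ and $e_0$ bounded as in (11.9) but with an extra $q^f$. In particular taking $\tilde{e} = 0$
we get a solution of (11.8) and this means that $\tilde{\sigma} = \psi^{-1}(\psi(\sigma))^{q^{e_0}}$ is also defined over $G$.
It lies in

$$\widetilde{W} = \psi^{-1}(\psi(W))^{q^{e_0}} \tag{11.11}$$

and so in $\widetilde{W}(G)$. We also have

$$\pi = \psi^{-1}(\psi(\tilde{\sigma}))^{\tilde{q}^{\tilde{e}}}$$

for $\tilde{q} = q^f$. Thus we conclude

$$V(G) \subset \psi^{-1}\Big(\bigcup_{\widetilde{W} \in \widetilde{\mathcal{W}}} \bigcup_{\tilde{e}=0}^{\infty} (\psi(\widetilde{W})(G))^{\tilde{q}^{\tilde{e}}}\Big) \tag{11.12}$$

for the finite set $\widetilde{\mathcal{W}}$ of $\widetilde{W}$ in (11.11). On the other hand

$$\psi(\widetilde{W})^{\tilde{q}^{\tilde{e}}} = (\psi(W))^{q^{e_0}\tilde{q}^{\tilde{e}}} \subset (\psi(V))^{q^{e_0}\tilde{q}^{\tilde{e}}} = \psi(V)$$

again because $\psi(V)$ is defined over $\mathbf{F}_q$. Thus we conclude equality in (11.12).
Finally we calculate that $h(\psi(\widetilde{W})) = q^{e_0} h(\psi(W))$ is bounded above by

$$q^f \max\{d_0, d_1, \ldots, d_r\}\, \Phi R(\sqrt{G})^2 \le q^f |\mathbf{F}_K|\, \Phi R(G)^2 \tag{11.13}$$

as in (11.10), and of course $f = \phi(d_0)\phi(d_1) \cdots \phi(d_r)$ depends only on $K$ and $G$ with

$$f \le d_0 d_1 \cdots d_r = [\sqrt{G} : G].$$

This completes the proof of (bb); and so the Main Estimate for points over G is proved.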

In (11.13) the term $q^f$ cannot be so easily absorbed into the regulator without
introducing an exponential dependence on $R(G)$. Let us discuss some aspects of this.

When $G = \sqrt{G}$ then $f = 1$ in (bb) and we are more or less back to (b) of the Main
Estimate. But in general we need the extra $f$ in (11.1). The following example shows that
it sometimes must be almost as large as $[\sqrt{G} : G]$.

We go back to the equation $t^m x + y = 1$ of (1.5) over $K = \mathbf{F}_p(t)$, with $n = 2$. It is to be
solved in the group $G = G_l$ generated by $t^l$ and $1 - t$, so that $r = 2$. Here $\sqrt{G}$ is generated
by $t$ and $1 - t$ together with a generator $\zeta$ of $\mathbf{F}_p^*$. The equation defines a $\sqrt{G}$-isotrivial line
$V$ with $\psi(x, y) = (t^m x, y) = (\tilde{x}, \tilde{y})$, so that $\widetilde{V} = \psi(V)$ is defined by $\tilde{x} + \tilde{y} = 1$, with $q = p$.

Now Leitner [Le] has found all points on $\widetilde{V}(\sqrt{G})$. If $p$ is odd there are $p - 2$ constant
points in $\mathbf{F}_p^2$ together with six infinite families

$$(\tilde{x}, \tilde{y}) = (\tilde{x}_0^{\,p^e}, \tilde{y}_0^{\,p^e}) \quad (e = 0, 1, \ldots),$$

where $(\tilde{x}_0, \tilde{y}_0)$ are given by
The $(x, y) = \psi^{-1}(\tilde{x}, \tilde{y}) = (t^{-m}\tilde{x}, \tilde{y})$ are all the points on $V(\sqrt{G})$. Choosing $m$ not divisible
by $l$, we see that none of the constant points give rise to points of $V(G)$. Similarly for the
second family above. And the same is true of the last four families above, simply because
of the minus signs. However the first family gives $(t^{-m}t^{p^e}, (1-t)^{p^e})$, which is in $G^2$ if and
only if

$$p^e \equiv m \bmod l. \tag{11.14}$$

Now Artin's Conjecture implies that given any prime $p$, there are infinitely many
primes $l$ for which $p$ is a primitive root modulo $l$. And Heath-Brown's Corollary 2 of
[He] (p. 27) implies that this is true for at least one of $p = 3, 5, 7$. We can choose $m$
with $1 \le m < l$ with $p^{l-2} \equiv m \bmod l$. Now (11.14) implies $e \equiv l - 2 \bmod l - 1$ so
$e = l - 2 + (l - 1)\tilde{e}$ ($\tilde{e} = 0, 1, \ldots$). Thus the surviving points on $V(G)$ are just the

$$\pi = \psi^{-1}(\psi(W))^{p^{(l-1)\tilde{e}}} \quad (\tilde{e} = 0, 1, \ldots) \tag{11.15}$$

with $W$ as the single point $(t^{-m}t^{p^{l-2}}, (1-t)^{p^{l-2}})$. This makes it clear that $f \ge l - 1$ in
(11.1); almost as big as $[\sqrt{G} : G] = (p - 1)l$ for fixed $p$.
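The arithmetic behind (11.14) is easy to reproduce numerically. The sketch below is only an illustration: it uses the values $p = 2$, $l = 83$, $m = 42$ that the text mentions in connection with (1.16), and the helper names are hypothetical. It checks that $p$ is a primitive root modulo $l$, that $m \equiv p^{l-2} \bmod l$, and lists the exponents $e$ below a chosen bound satisfying $p^e \equiv m \bmod l$.

```python
def multiplicative_order(a, modulus):
    """Smallest k >= 1 with a^k = 1 (mod modulus); assumes gcd(a, modulus) = 1."""
    k, x = 1, a % modulus
    while x != 1:
        x = (x * a) % modulus
        k += 1
    return k


def admissible_exponents(p, l, m, bound):
    """All e in [0, bound) with p^e = m (mod l), as in congruence (11.14)."""
    return [e for e in range(bound) if pow(p, e, l) == m % l]
```

With these values the order of 2 modulo 83 is 82, so 2 is a primitive root, and the admissible exponents form the single progression $e = 81, 163, 245, \ldots$ with difference $l - 1 = 82$. The smallest surviving point therefore already involves the exponent $p^{81}$.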

We could also see this from (11.2). For as $R(G) = l\sqrt{3}$, it implies that there would
be a point $\pi$ on $V(G)$ with $h(\psi(\pi)) \le cp^f l^2$ for $c$ absolute. But the point (11.15) has
$\tilde{y} = y = (1 - t)^{p^{l-2}p^{(l-1)\tilde{e}}}$ so

$$h(\psi(\pi)) \ge p^{l-2} p^{(l-1)\tilde{e}} \ge p^{l-2}. \tag{11.16}$$

Making $l \to \infty$, we deduce $f \ge l - c' \log l$, also almost as big as $[\sqrt{G} : G] = (p - 1)l$.
Less precisely, there can be no estimate

$$h(\psi(W)) \le C(n, r, K)(h(V) + R(G))^{\kappa}$$

replacing (11.2) which is polynomial in $h(V)$ and $R(G)$ for fixed $n, r, K$. For this would
give a point with $h(\psi(\pi)) \le c''(m + l)^{\kappa} \le c''' l^{\kappa}$, contradicting (11.16). Similarly one sees
that if the dependence on $h(V)$ is polynomial, then the dependence on $R(G)$ must be
exponential. This explains the large solutions like (1.16), with $p = 2$, $l = 83$, $m = 42$.

12. Proof of Descent Steps and Theorems. In the Descent Steps the variety $V$
is certainly defined over a finitely generated transcendental extension $K$ of $\mathbf{F}_p$, and now
we can choose any separable transcendence basis to obtain a height function. Now the
Descent Step over $\sqrt{G}$ follows from the Main Estimate for isotrivial W. And the Descent
Step over $G$ follows, at least without the assumption that the $W$ are $\sqrt{G}$-isotrivial, from
the Main Estimate for points over G. This assumption can be removed by induction just
as in section 10 (without bothering about estimates): any $W$ that is not $\sqrt{G}$-isotrivial can
be replaced by a finite union of $\sqrt{G}$-isotrivial varieties.

To prove Theorem 1 we may assume that $V$ has positive dimension. We apply the
Main Estimate for points over G repeatedly, taking always $q = |\mathbf{F}_K|^{f_K(G)}$ for safety. With
$V_0 = V$, an arbitrary point $\pi$ of $V_0(G)$ is either a point of $W(G)$ for finitely many $W$ in
$V_0$ with $\dim W \le \dim V - 1$, or a point $\psi_1^{-1}\varphi^{e_1}\psi_1(\pi_1)$ for $\pi_1$ in $V_1(G)$ for finitely many $V_1$
in $V_0$ with $\dim V_1 \le \dim V - 1$ and some $e_1 \ge 0$, with $\psi_1(V_0)$ defined over $\mathbf{F}_q$. Then we
argue similarly with $\pi_1$; and so on. After at most $\dim V \le n - 1$ steps we descend to cosets
$T = V_h$, and only finitely many $\psi_1, \ldots, \psi_h$ turn up on the way, leading to expressions as
in (1.12) and thereby establishing Theorem 1.

For later use we note that not just the varieties $T$ but also the whole unions $[\psi_1, \ldots, \psi_h]T$
lie in the variety $V$. Why is this? Well, a typical point of the union has the shape
$\pi = (\psi_1^{-1}\varphi^{e_1}\psi_1) \cdots (\psi_h^{-1}\varphi^{e_h}\psi_h)(\tau)$ for some $e_1, \ldots, e_h$ and some $\tau$ in $T$. The descent for
Theorem 1 provides linear varieties $V = V_0, V_1, \ldots, V_h = T$. Now clearly $\tau$ lies in $T$ inside
$V_{h-1}$, so $\psi_h^{-1}\varphi^{e_h}\psi_h(\tau)$ lies in $V_{h-1}$
inside $V_{h-2}$. In the same way $(\psi_{h-1}^{-1}\varphi^{e_{h-1}}\psi_{h-1})(\psi_h^{-1}\varphi^{e_h}\psi_h)(\tau)$ lies in $V_{h-2}$ inside $V_{h-3}$.
Continuing backwards we see that $\pi = (\psi_1^{-1}\varphi^{e_1}\psi_1) \cdots (\psi_h^{-1}\varphi^{e_h}\psi_h)(\tau)$ lies in $V$.

We leave it to the reader to check, by a straightforward induction argument like that
in section 10 and also using Lemma 7.2, that for Theorem 1 one can take

$$\max\{h(\psi_1), \ldots, h(\psi_h), h(T)\} \le (2q^2 A R(G)^{6n+2})^{\rho(m)} h(V)^{\eta(m)} \tag{12.1}$$

in the notation of section 10. This indeed looks polynomial in $R(G)$ and $h(V)$; however,
as we noted, an exponential dependence on $R(G)$ may be hiding in $q = |\mathbf{F}_K|^{f_K(G)}$.

For the symmetrization argument in the proof of Theorem 2 we need a version of
Lemma 8.1 (p. 209) of [D], partly removed from its recurrence context.

Lemma 12.1. For $m \ge 1$ and $x_1, \ldots, x_m, y_1, \ldots, y_m$ in $K$ suppose that

$$x_1 y_1^{q^l} + \cdots + x_m y_m^{q^l} = 0 \tag{12.2}$$

for all large $l$. Then this holds for all $l \ge 0$.

Proof. The proof will be by induction on $m$, the case $m = 1$ being trivial. For the induction step we can clearly assume that $x_1, \dots, x_m$ are non-zero. Now we note that (12.2) for any $m$ consecutive integers $l = g, g+1, \dots, g+m-1$ implies the linear dependence of $y_1, \dots, y_m$ over $\mathbf{F}_q$. For if we regard these as linear equations for $x_1, \dots, x_m$, the underlying determinant is the $q^g$ power of that with entries $y_i^{q^{j-1}}$ ($i,j = 1, \dots, m$), and it is well-known that the latter, a so-called Moore determinant, is up to a constant the product of the $\beta_1y_1 + \cdots + \beta_my_m$ taken over all $(\beta_1, \dots, \beta_m)$ in $\mathbf{P}_{m-1}(\mathbf{F}_q)$ (see for example [Go] Corollary 1.3.7 p. 8). Thus after permuting we can suppose that $y_m = a_1y_1 + \cdots + a_{m-1}y_{m-1}$ for $a_1, \dots, a_{m-1}$ in $\mathbf{F}_q$. Substituting into (12.2) gives

$$(x_1 + a_1x_m)y_1^{q^l} + \cdots + (x_{m-1} + a_{m-1}x_m)y_{m-1}^{q^l} = 0,$$

which therefore also holds for all large $l$. By the induction hypothesis we conclude that this holds for all $l \ge 0$, which leads back to (12.2) for all $l \ge 0$ and thus completes the proof.
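The Moore determinant factorization used in the proof can be checked by machine in the smallest interesting case. The following is only a sketch under stated assumptions: it takes $m = 2$, $q = p = 3$, and elements of $\mathbf{F}_p[t]$ stored as coefficient lists (lowest degree first); the helper names are hypothetical and not from the paper. With the representatives $(1,0)$ and $(c,1)$ of $\mathbf{P}_1(\mathbf{F}_p)$ the constant happens to be 1.

```python
P = 3  # assumed characteristic for this demo; q = p here

def trim(a):
    """Reduce coefficients mod P and drop trailing zeros."""
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def scale(c, a):
    return trim([c * x for x in a])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def power(a, e):
    out = [1]
    for _ in range(e):
        out = mul(out, a)
    return out

def moore_det_2(y1, y2):
    """det [[y1, y1^q], [y2, y2^q]] = y1*y2^q - y2*y1^q with q = P."""
    return add(mul(y1, power(y2, P)), scale(-1, mul(y2, power(y1, P))))

def product_of_forms(y1, y2):
    """Product of b1*y1 + b2*y2 over the points (1,0) and (c,1) of P_1(F_p)."""
    out = y1
    for c in range(P):
        out = mul(out, add(scale(c, y1), y2))
    return out
```

For instance, with $y_1 = t$ and $y_2 = 1 + t$ both functions return the coefficient list of $t - t^3$, while for the $\mathbf{F}_p$-dependent pair $y_1 = t$, $y_2 = 2t$ the determinant is zero.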

To prove Theorem 2 consider a single $[\psi_1, \dots, \psi_h]T(G)$ coming from Theorem 1. Fix $\tau_0$ in $T(G)$; then $T = \tau_0S$ for a linear subgroup $S$.

We argue first on the geometric level. According to (1.12) a typical point of $[\psi_1, \dots, \psi_h]T$ has the shape

$$\psi_1^{-1}\bigl(\psi_1\psi_2^{-1}\bigl(\psi_2 \cdots \psi_h^{-1}(\psi_h\tau_0\sigma)^{q_h} \cdots \bigr)^{q_2}\bigr)^{q_1}$$

with $q_i = q^{e_i}$ ($i = 1, \dots, h$) and $\sigma$ in $S$; here we are regarding the $\psi_i$ ($i = 1, \dots, h$) as multiplication by points instead of automorphisms. This expression can be written as

$$\pi_0\,\pi_1^{q_1}\,\pi_2^{q_1q_2} \cdots \pi_{h-1}^{q_1\cdots q_{h-1}}\,\pi_h^{q_1\cdots q_h}\,\sigma^{q_1\cdots q_h} \tag{12.3}$$

with

$$\pi_0 = \psi_1^{-1},\quad \pi_1 = \psi_1\psi_2^{-1},\quad \dots,\quad \pi_{h-1} = \psi_{h-1}\psi_h^{-1},\quad \pi_h = \psi_h\tau_0. \tag{12.4}$$

Now when we write $q^{l_i} = q_1 \cdots q_i$ ($i = 1, \dots, h$) we certainly get a point of $(\pi_0, \pi_1, \dots, \pi_h)S$ according to (1.14); but at the moment we have asymmetry $l_1 \le \cdots \le l_h$. We eliminate the inequalities here as in [D] (p. 212).

Let us start with the last inequality. We can write (12.3) as $\xi\eta^{q^l}$ with $\xi$ and $\eta$ independent of $l = l_h$. We already remarked that $[\psi_1, \dots, \psi_h]T$ lies in $V$, so (12.3) does. Thus for each linear form $\mathcal{L}$ defining $V$ we have $\mathcal{L}(\xi\eta^{q^l}) = 0$ for all $l_1, \dots, l_{h-1}, l$ with $0 \le l_1 \le \cdots \le l_{h-1} \le l$. Fixing $l_1, \dots, l_{h-1}$, we see from Lemma 12.1 that this equation for all large $l$ implies the same equation for all $l \ge 0$. Thus the inequality $l_{h-1} \le l_h$ has indeed been eliminated. Similar arguments work for the other conditions, as is clear from the arguments of [D] (p. 212) after equation (22). For example, the next step fixes $l_1, \dots, l_{h-2}, l_h$ but not $l = l_{h-1}$.

Looking back at (12.3), we have therefore proved that all the points

$$\pi_0\pi_1^{r_1} \cdots \pi_h^{r_h}\sigma^{r_h} \tag{12.5}$$

lie in $V$, where the integers $r_i = q^{l_i}$ ($i = 1, \dots, h$) now range independently over all positive integral powers of $q$. This is the required symmetrization at the geometric level.

It actually shows that the entire $(\pi_0, \pi_1, \dots, \pi_h)S$ lies in $V$. For a typical point of the former has the shape

$$\pi_0\pi_1^{r_1} \cdots \pi_h^{r_h}\sigma \tag{12.6}$$

for $\sigma$ in $S$. And there is $\sigma'$ in $S$ with $(\sigma')^{r_h} = \sigma$. This could be interpreted as something about the divisibility of group varieties; but for us it is just a simple consequence of the fact that $S$ is defined by equations $X_i = X_j$. And now (12.6) is equal to (12.5) with $\sigma'$ in place of $\sigma$.

At the arithmetic level we claim that $(\pi_0, \pi_1, \dots, \pi_h)S(G)$ lies in $V(G)$. In fact every point

$$\pi = \pi_0\pi_1^{r_1} \cdots \pi_h^{r_h} \tag{12.7}$$

with asymmetry $r_1 \le \cdots \le r_h$ has the shape (12.3) (with all coordinates of $\sigma$ equal to 1). It therefore lies in $[\psi_1, \dots, \psi_h]T(G)$ which is in turn contained in $V(G)$. In particular $\pi$ lies in $\mathbf{P}_n(G)$. But why does it continue to lie in $\mathbf{P}_n(G)$ when the asymmetry is lifted?

Well, we can take $r_1 = \cdots = r_h = 1$ in (12.7) to see that the product

$$\pi_0\pi_1 \cdots \pi_h \tag{12.8}$$

lies in $\mathbf{P}_n(G)$. Then taking $r_1 = \cdots = r_{h-1} = 1$, $r_h = q$ we can deduce that $\pi_h^{q-1}$ lies in $\mathbf{P}_n(G)$. And taking $r_1 = \cdots = r_{h-2} = 1$, $r_{h-1} = r_h = q$ we deduce that $\pi_{h-1}^{q-1}$ lies in $\mathbf{P}_n(G)$. And so on, until we see that all of

$$\pi_1^{q-1}, \dots, \pi_h^{q-1} \tag{12.9}$$

lie in $\mathbf{P}_n(G)$ (this was already remarked in section 1).

And now if $r_1, \dots, r_h$ are arbitrary integral powers of $q$ in (12.7) we can write

$$\pi = (\pi_0\pi_1 \cdots \pi_h)\,\pi_1^{r_1-1} \cdots \pi_h^{r_h-1}$$

to see from (12.8) and (12.9) that indeed $\pi$ lies in $\mathbf{P}_n(G)$; note that each $r_i - 1$ is divisible by $q - 1$.

Now any point of $(\pi_0, \pi_1, \dots, \pi_h)S(G)$ by (12.5) has the form $\pi\sigma^{r_h}$ with $\pi$ as above and $\sigma$ in $S(G)$. It follows that $(\pi_0, \pi_1, \dots, \pi_h)S(G)$ lies in $V(G)$ as claimed.

On the other hand, taking all coordinates of $\sigma$ as 1 in (12.3) shows that $[\psi_1, \dots, \psi_h]\{\tau_0\}$ lies in $(\pi_0, \pi_1, \dots, \pi_h)S(G)$. As we could have fixed $\tau_0$ arbitrarily in $T(G)$, we see that $[\psi_1, \dots, \psi_h]T(G)$ lies in $(\pi_0, \pi_1, \dots, \pi_h)S(G)$.

It follows that $V(G)$ is indeed the union of the $(\pi_0, \pi_1, \dots, \pi_h)S(G)$, which completes the proof of Theorem 2. We note for later use the fact, already observed, that each $(\pi_0, \pi_1, \dots, \pi_h)S$ is contained in $V$.

Here too we leave it to the reader to check using (12.1) that for Theorem 2 one can take

max{h(n ),h(n 1 ),...,h(n h )} < (n+l)(2q 2 AR(Gf n+2 ) p{m) h(V)^ m) . (12.10) 

This follows quickly from (12.4) and the easy fact that any $T(G)$ contains $\tau_0$ with $h(\tau_0) \le h(T)$.

To prove part (1) of Theorem 3 we start from Theorem 1 with $V = H$. We first claim that if some $\pi$ in $H(G)$ lies in some $[\psi_1, \dots, \psi_h]T(G)$ with $T$ not a single point then some (1.2) fails for it. To see this, note that if $T$ is not a single point, then there is a partition of $\{0, 1, \dots, n\}$ into proper subsets $I, J, \dots$ such that $T$ is defined by the proportionality of the homogeneous coordinates $X_i$ ($i \in I$), $X_j$ ($j \in J$), and so on. We may suppose that $I$ contains 0 and that the equations corresponding to $I$ are $g_iX_0 = g_0X_i$ for $i$ in $I$. Consider the point $\eta$ in $\mathbf{P}_n$ whose coordinates $X_i = g_i$ for $i$ in $I$ but with all other coordinates zero. It also lies in $T$.

Now $\pi = (\psi_1^{-1}\varphi^{e_1}\psi_1) \cdots (\psi_h^{-1}\varphi^{e_h}\psi_h)(\tau)$ for some $e_1, \dots, e_h$ and some $\tau$ in $T$. From our remark following the proof of Theorem 1, we see that $\pi_I = (\psi_1^{-1}\varphi^{e_1}\psi_1) \cdots (\psi_h^{-1}\varphi^{e_h}\psi_h)(\eta)$ lies in $H$. Now $\tau$ and $\eta$ have the same coordinates $X_i$ ($i \in I$). It follows that $\pi$ and $\pi_I$ have the same coordinates $X_i$ ($i \in I$). Since the other coordinates of $\pi_I$ are zero, this means that (1.2) fails for $\pi$ as claimed.

Therefore $H^*(G)$ is contained in a finite union of sets $[\psi_1, \dots, \psi_h]\{\tau\}$. And each of these lies in $H(G)$. This proves part (1) of Theorem 3.

Part (2) follows in a similar way with the help of the remark after the proof of Theorem 2, with $\pi = \pi_0(\varphi^{l_1}\pi_1) \cdots (\varphi^{l_h}\pi_h)\sigma$ and $\pi_I = \pi_0(\varphi^{l_1}\pi_1) \cdots (\varphi^{l_h}\pi_h)\sigma_I$ for $\sigma_I$ defined by $X_i = 1$ for $i$ in $I$ but with all other coordinates zero. This shows that we can restrict to single points $S$, and the proof is finished as above. We have therefore proved all of Theorem 3.

It is easy to deduce explicit estimates for Theorem 3 as for Theorems 1 and 2. One obtains at once (12.1) (with $T$ replaced by $\tau$) and (12.10).

13. Limitation results. We show here that for each $n \ge 2$ the bounds $h \le n-1$ in Theorems 1 and 2 cannot always be improved; and also that if $p > 2$ the $\psi_1, \dots, \psi_h$ in Theorem 1 and the $\pi_0, \pi_1, \dots, \pi_h$ in Theorem 2 cannot always be chosen over $G$.

We start with $h \le n-1$. Because Theorem 1 directly implies Theorem 2 and then Theorem 3, it will suffice to prove the analogous statements for Theorem 3. Also we have seen that each $[\psi_1, \dots, \psi_h]\{\tau\}$ in Theorem 3(1) is contained in some $(\pi_0, \pi_1, \dots, \pi_h)$ in Theorem 3(2). So it is enough to treat Theorem 3(2).

This we do with the affine hyperplane

$$x_1 + x_2 - x_3 - \cdots - x_n = 1 \tag{13.1}$$

already mentioned.

We need a simple observation. For a prime $p$ let $R = R_p$ be the set of points $(1, r_1, \dots, r_{n-1})$ as the integers $r_1, \dots, r_{n-1}$ run through all powers of $p$ satisfying the asymmetry conditions that $r_i$ divides $r_{i+1}$ ($i = 1, \dots, n-2$) and also the extra conditions

$$r_{n-1} \ne r_{n-2},\quad r_{n-1} \ne r_{n-2} + r_{n-3},\quad \dots,\quad r_{n-1} \ne r_{n-2} + r_{n-3} + \cdots + r_1. \tag{13.2}$$

Lemma 13.1. The set $R$ does not lie in a finite union of proper subgroups of $\mathbf{Z}^n$.

Proof. We can actually disregard the conditions (13.2) because their failure would just add more to the finite union of proper subgroups. Now the falsity of the lemma would lead to an equation

$$\mathcal{F}(p^{e_1}, \dots, p^{e_{n-1}}) = 0 \tag{13.3}$$

holding for all non-negative integers $e_1, \dots, e_{n-1}$, where $\mathcal{F}(y_1, \dots, y_{n-1})$ is a finite product of polynomials

$$A = a_0 + a_1y_1 + a_2y_1y_2 + \cdots + a_{n-1}y_1y_2 \cdots y_{n-1}$$

corresponding to the proper subgroups of $\mathbf{Z}^n$ perpendicular to $(a_0, \dots, a_{n-1}) \ne 0$. It is clear that each $A \ne 0$ and so $\mathcal{F} \ne 0$. On the other hand it is easy to see that the points in (13.3) are Zariski-dense in $\mathbf{R}^{n-1}$. This contradiction proves the lemma.

Take as usual $K = \mathbf{F}_p(t)$ and $G$ generated by $t$ and $1-t$. We proceed to exhibit many points on $H^*(G)$ with $H$ defined by (13.1).

For integral powers $q_1, \dots, q_{n-1}$ of $p$ define

$$r_1 = q_{n-1},\quad r_2 = q_{n-1}q_{n-2},\quad \dots,\quad r_{n-1} = q_{n-1} \cdots q_1$$

and

$$d_1 = r_{n-1} - r_{n-2} - \cdots - r_2 - r_1,$$

down to

$$d_{n-2} = r_{n-1} - r_{n-2}$$

and

$$d_{n-1} = r_{n-1}.$$

Then

$$x_1 = t^{d_1},\quad x_2 = 1 - t^{d_{n-1}},\quad x_3 = t^{d_{n-2}} - t^{d_{n-1}},\quad \dots,\quad x_n = t^{d_1} - t^{d_2} \tag{13.4}$$

certainly satisfy (13.1), so the point $\xi = (x_1, \dots, x_n)$ lies in $H$. It is in $H(G)$ because

$$x_2 = 1 - t^{r_{n-1}} = (1-t)^{r_{n-1}},$$
$$x_3 = t^{d_{n-2}}(1 - t^{r_{n-2}}) = t^{d_{n-2}}(1-t)^{r_{n-2}},$$

and so on.
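As a sanity check on (13.4), the construction can be run for small parameters. The sketch below is illustrative only: it fixes $p = 3$, stores elements of $\mathbf{F}_p[t, t^{-1}]$ as dictionaries from exponents to coefficients, and uses hypothetical helper names (`example_point`, `on_hyperplane`, `in_group`) that do not come from the paper.

```python
P = 3  # assumed characteristic for this demo

def poly(terms):
    """Laurent polynomial over F_p as {exponent: coeff}, zero terms dropped."""
    out = {}
    for e, c in terms:
        out[e] = (out.get(e, 0) + c) % P
    return {e: c for e, c in out.items() if c}

def add(a, b, sign=1):
    return poly(list(a.items()) + [(e, sign * c) for e, c in b.items()])

def mul(a, b):
    return poly([(e1 + e2, c1 * c2)
                 for e1, c1 in a.items() for e2, c2 in b.items()])

def power(a, n):
    out = {0: 1}
    for _ in range(n):
        out = mul(out, a)
    return out

def example_point(qs):
    """qs = [q_1, ..., q_{n-1}] (powers of p); returns (r, d, x) as in (13.4)."""
    n = len(qs) + 1
    r, acc = [], 1
    for q in reversed(qs):          # r_1 = q_{n-1}, r_2 = q_{n-1} q_{n-2}, ...
        acc *= q
        r.append(acc)
    # d[i] holds d_{i+1} = r_{n-1} - r_{n-2} - ... - r_{i+1}
    d = [r[-1] - sum(r[i:-1]) for i in range(n - 1)]
    x = [poly([(d[0], 1)]), poly([(0, 1), (d[-1], -1)])]
    for k in range(3, n + 1):       # x_k = t^{d_{n-k+1}} - t^{d_{n-k+2}}
        x.append(poly([(d[n - k], 1), (d[n - k + 1], -1)]))
    return r, d, x

def on_hyperplane(x):
    """Check x_1 + x_2 - x_3 - ... - x_n = 1, i.e. equation (13.1)."""
    total = add(x[0], x[1])
    for xk in x[2:]:
        total = add(total, xk, sign=-1)
    return total == {0: 1}

def in_group(r, d, x):
    """Check x_2 = (1-t)^{r_{n-1}} and x_k = t^{d_{n-k+1}} (1-t)^{r_{n-k+1}}."""
    n = len(x)
    one_minus_t = poly([(0, 1), (1, -1)])
    ok = x[1] == power(one_minus_t, r[-1])
    for k in range(3, n + 1):
        j = n - k
        ok = ok and x[k - 1] == mul(poly([(d[j], 1)]), power(one_minus_t, r[j]))
    return ok
```

For example, `example_point([3, 3])` corresponds to $n = 3$ with $r_1 = 3$, $r_2 = 9$, $d_1 = 6$, $d_2 = 9$, and both checks succeed for it.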

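This also leads to a multiplicative representation

$$\xi = \xi_1^{r_1} \cdots \xi_{n-1}^{r_{n-1}} \tag{13.5}$$

of the point in (13.4), where

$$\xi_1 = \left(\tfrac{1}{t}, 1, 1, 1, 1, \dots, 1, 1, \tfrac{1-t}{t}\right),$$
$$\xi_2 = \left(\tfrac{1}{t}, 1, 1, 1, 1, \dots, 1, \tfrac{1-t}{t}, \tfrac{1}{t}\right),$$
$$\xi_3 = \left(\tfrac{1}{t}, 1, 1, 1, 1, \dots, \tfrac{1-t}{t}, \tfrac{1}{t}, \tfrac{1}{t}\right),$$

down to

$$\xi_{n-2} = \left(\tfrac{1}{t}, 1, \tfrac{1-t}{t}, \tfrac{1}{t}, \dots, \tfrac{1}{t}\right),$$

but

$$\xi_{n-1} = (t, 1-t, t, t, t, \dots, t, t, t).$$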

We can quickly check that $\xi_1, \dots, \xi_{n-1}$ are multiplicatively independent. Namely, a relation

$$\xi_1^{a_1} \cdots \xi_{n-1}^{a_{n-1}} = (1, 1, 1, 1, 1, \dots, 1, 1, 1)$$

would lead to $a_{n-1} = 0$ on examining the second components, then $a_{n-2} = 0$ from the third components, and so on down to $a_1 = 0$.
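Both the representation (13.5) and the independence claim reduce to integer linear algebra once each coordinate $t^a(1-t)^b$ is recorded as the exponent pair $(a, b)$; this uses the fact that $t$ and $1-t$ are multiplicatively independent in $\mathbf{F}_p(t)$. The following sketch is illustrative only, with hypothetical function names and with the rows $\xi_i$ built as reconstructed from the pattern above.

```python
from fractions import Fraction

def xi_exponents(n):
    """Rows xi_1..xi_{n-1}; entry (a, b) in a coordinate means t^a (1-t)^b."""
    rows = []
    for i in range(1, n - 1):                  # xi_1, ..., xi_{n-2}
        row = [(-1, 0)] + [(0, 0)] * (n - 1)
        pivot = n + 1 - i                      # 1-based slot holding (1-t)/t
        row[pivot - 1] = (-1, 1)
        for pos in range(pivot + 1, n + 1):    # trailing slots hold 1/t
            row[pos - 1] = (-1, 0)
        rows.append(row)
    rows.append([(1, 0), (0, 1)] + [(1, 0)] * (n - 2))   # xi_{n-1}
    return rows

def product_exponents(rows, r):
    """Exponent pairs of xi_1^{r_1} ... xi_{n-1}^{r_{n-1}}, coordinate by coordinate."""
    n = len(rows[0])
    return [(sum(ri * row[c][0] for ri, row in zip(r, rows)),
             sum(ri * row[c][1] for ri, row in zip(r, rows)))
            for c in range(n)]

def expected_exponents(r):
    """Exponent pairs of (13.4): x_1 = t^{d_1}, x_2 = (1-t)^{r_{n-1}}, and so on."""
    n = len(r) + 1
    d = [r[-1] - sum(r[i:-1]) for i in range(n - 1)]
    return [(d[0], 0), (0, r[-1])] + [(d[n - k], r[n - k]) for k in range(3, n + 1)]

def rank(matrix):
    """Rank over Q by Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

def independent(n):
    """True if the flattened exponent vectors of xi_1..xi_{n-1} have full rank."""
    flat = [[v for pair in row for v in pair] for row in xi_exponents(n)]
    return rank(flat) == n - 1
```

With $n = 4$ and $(r_1, r_2, r_3) = (3, 3, 9)$ the two exponent computations agree, and `independent(4)` holds.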

The case $n = 3$ with $q_1 = q$, $q_2 = r$ is of course (1.11) or (1.13).

We can see that (13.4) lies in $H^*(G)$ provided $(1, r_1, \dots, r_{n-1})$ lies in $R$. For the various exponents of $t$ clearly satisfy $d_{n-1} > d_{n-2} > \cdots > d_2 > d_1$. There is one more exponent 0; but $d_{n-1} \ne 0$ and from the definition of $R$ we also have $d_{n-2} \ne 0, \dots, d_1 \ne 0$. Thus the exponents $d_{n-1}, \dots, d_1, 0$ in (13.4) are distinct, and it is easy to see that there can be no vanishing subsum of $x_1, x_2, -x_3, \dots, -x_n$ (in fact each of $d_{n-2} = 0, \dots, d_1 = 0$ does lead to a vanishing subsum). We already remarked that (1.13) is in $H^*$ as long as $r \ne s$, that is $q_1 \ne 1$, that is $r_2 \ne r_1$ as in (13.2).

Now we can prove as promised that $H^*(G)$ does not lie in a finite union of sets

$$\Pi = (\pi_0, \pi_1, \dots, \pi_h)_q = \bigcup_{l_1=0}^{\infty} \cdots \bigcup_{l_h=0}^{\infty} \pi_0\pi_1^{q^{l_1}} \cdots \pi_h^{q^{l_h}} \tag{13.6}$$

for some $q$ and points $\pi_0, \pi_1, \dots, \pi_h$ with $h < n-1$. The idea is to note that each $\Pi$ lies in a coset of dimension at most $h \le n-2$; whereas the points (13.5) have rank $n-1$.

Accordingly we assume that $H^*(G)$ does lie in such a finite union and we shall reach a contradiction.

Now for each element of $R$ the corresponding (13.5) lies in $H^*(G)$ so in some $\Pi$. This provides a partition of $R$ into a finite union of subsets $R_\Pi$. By Lemma 13.1 we will be through if we can prove that each $R_\Pi$ lies in a proper subgroup of $\mathbf{Z}^n$.

Suppose for some $\Pi$ we are lucky in the sense that the corresponding $\pi_0$ in (13.6) is multiplicatively independent of $\xi_1, \dots, \xi_{n-1}$. The corresponding

$$\pi_0^{-1}\xi = \pi_0^{-1}\xi_1^{r_1} \cdots \xi_{n-1}^{r_{n-1}}$$

all lie in the group generated by $\pi_1, \dots, \pi_h$ and so the multiplicative rank of the various $\pi_0^{-1}\xi$ is at most $h \le n-2$. Since $\pi_0, \xi_1, \dots, \xi_{n-1}$ are independent, it follows that the set $R_\Pi$ cannot contain $n$ (or even $n-1$) independent elements. So it must indeed lie in a proper subgroup of $\mathbf{Z}^n$.

In fact we are not so likely to be that lucky, and it is more probable that there is a relation $\pi_0^a = \xi_1^{a_1} \cdots \xi_{n-1}^{a_{n-1}}$ with $a \ne 0$. Now the

$$(\pi_0^{-1}\xi)^a = \xi_1^{ar_1 - a_1} \cdots \xi_{n-1}^{ar_{n-1} - a_{n-1}}$$

still lie in a group of rank at most $n-2$. Since $\xi_1, \dots, \xi_{n-1}$ are independent, we deduce as above that the set of all $(ar_1 - a_1, \dots, ar_{n-1} - a_{n-1})$ lies in a proper subgroup of $\mathbf{Z}^{n-1}$. And this implies as above that $R_\Pi$ lies in a proper subgroup of $\mathbf{Z}^n$.

That finishes the proof of the first limitation result. We could also have argued with a symmetrized version of $R$; then the $A$ in the proof of Lemma 13.1 could be taken more simply as $a_0 + a_1y_1 + a_2y_2 + \cdots + a_{n-1}y_{n-1}$.

We can use similar arguments to prove the second limitation result concerning non-definability over $G$. Because the $[\psi_1, \dots, \psi_h]T(G)$ in Theorem 1 lead to $(\pi_0, \pi_1, \dots, \pi_h)$ in Theorem 2 with (12.4) for $\tau_0$ in $T(G)$, it will again suffice to check the matter for Theorem 3(2).

This we do with the affine line $H$ defined by $tx + y = 1$ also over $K = \mathbf{F}_p(t)$, now with $G$ generated by $t^{p-1}$ and $1-t$. It is the example treated at the end of section 11 with $m = 1$ and $l = p-1$. We need another simple observation.

Lemma 13.2. For an odd prime $p$ suppose that

$$q_1 + q_2 + q_3 = q_1' + q_2' + q_3' \tag{13.7}$$

for integral powers $q_1, q_2, q_3, q_1', q_2', q_3'$ of $p$. Then $q_1, q_2, q_3$ are a permutation of $q_1', q_2', q_3'$.

Proof. If $q_1, q_2, q_3$ are all different then the left-hand side of (13.7) has just three ones in its expansion to base $p$. So also the right-hand side; which means that $q_1', q_2', q_3'$ are also all different. The result in this case is now clear (even for $p = 2$). If say $q_1 \ne q_2 = q_3$ then we get a one and a two in the expansion because $p \ne 2$; so after a permutation $q_1' \ne q_2' = q_3'$ too, and the result is still clear. Similarly if $q_1 = q_2 = q_3$ as long as $p \ne 3$. This last case can also be checked directly when $p = 3$ and this proves the lemma; however the example $1 + 1 + 4 = 2 + 2 + 2$ shows that $p = 2$ is not to be saved.
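Lemma 13.2 is easy to test by brute force over a bounded range of exponents. The sketch below is only a finite check, not a proof; the function name `violations` and the exponent bound are assumptions made for the illustration.

```python
from itertools import combinations_with_replacement

def violations(p, max_exp):
    """Groups of distinct multisets {q1, q2, q3} of powers of p with equal sums.

    Each triple is generated in sorted order, so two different triples in the
    same group are genuinely different multisets, i.e. not permutations.
    """
    powers = [p ** e for e in range(max_exp + 1)]
    by_sum = {}
    for triple in combinations_with_replacement(powers, 3):
        by_sum.setdefault(sum(triple), []).append(triple)
    return [group for group in by_sum.values() if len(group) > 1]
```

For odd $p$ the returned list should be empty in any range, while for $p = 2$ the pair $(1, 1, 4)$, $(2, 2, 2)$ from the proof appears among the violations.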

Now the analysis in section 11 before the primitive root business shows easily that the points of $H^*(G) = H(G)$ are given by

$$x = t^{r-1},\quad y = (1-t)^r \quad (r = 1, p, p^2, \dots). \tag{13.8}$$

This is $(x, y) = \xi_0\xi_1^r$ for $\xi_0 = (t^{-1}, 1)$ and $\xi_1 = (t, 1-t)$. Assume $p \ne 2$. If $H^*(G)$ were contained in a finite union of

$$\Pi = (\pi_0, \pi_1)_q = \bigcup_{l=0}^{\infty} \pi_0\pi_1^{q^l}$$

for some $q$ and some $\pi_0, \pi_1$ over $G$, then one of these $\Pi$ would certainly contain at least three different points (13.8). This gives equations

$$\xi_0\xi_1^{r} = \pi_0\pi_1^{s},\quad \xi_0\xi_1^{r'} = \pi_0\pi_1^{s'},\quad \xi_0\xi_1^{r''} = \pi_0\pi_1^{s''} \tag{13.9}$$

for powers $r < r' < r''$ of $p$ and powers $s, s', s''$ of $q$. Eliminating $\pi_0, \pi_1$ leads to

$$(\xi_1^{r})^{s'-s''}(\xi_1^{r'})^{s''-s}(\xi_1^{r''})^{s-s'} = 1;$$

that is, $\xi_1^a = 1$ for

$$a = r(s'-s'') + r'(s''-s) + r''(s-s').$$

So $a = 0$; that is,

$$rs' + r's'' + r''s = rs'' + r's + r''s'.$$

Lemma 13.2 shows in particular that $rs'$ is one of the terms on the right. But which one? Certainly $rs' \ne r''s'$. And $rs' \ne rs''$ else $s' = s''$ and (13.9) would imply $r' = r''$. It follows that $rs' = r's$. But now eliminating $\xi_1$ from the first two equations in (13.9) leads to $\xi_0^{r'-r} = \pi_0^{r'-r}$. Thus there would be $\alpha, \beta$ in $\mathbf{F}_p$ with $(\alpha t^{-1}, \beta) = (\alpha, \beta)\xi_0 = \pi_0$; however this is impossible because $\alpha t^{-1}$ is not in $G$ if $p \ne 2$.
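Two computations in this argument are simple enough to confirm numerically: that the points (13.8) lie on the line $tx + y = 1$, and that eliminating $\pi_1$ from the differences of the equations (13.9) produces exactly the exponent $a$. The following sketch is illustrative; the function names are hypothetical, and polynomials in $\mathbf{F}_p[t]$ are stored as coefficient lists with the lowest degree first.

```python
from math import comb

def line_value(p, r):
    """Coefficients mod p of t*x + y for x = t^(r-1), y = (1-t)^r."""
    coeffs = [comb(r, k) * (-1) ** k % p for k in range(r + 1)]  # (1-t)^r
    coeffs[r] = (coeffs[r] + 1) % p                              # add t * t^(r-1)
    return coeffs

def on_line(p, r):
    """True if t*x + y reduces to the constant polynomial 1."""
    return line_value(p, r) == [1] + [0] * r

def exponent_a(r, r1, r2, s, s1, s2):
    """The exponent a = r(s'-s'') + r'(s''-s) + r''(s-s') from the text."""
    return r * (s1 - s2) + r1 * (s2 - s) + r2 * (s - s1)

def eliminated_exponent(r, r1, r2, s, s1, s2):
    """Exponent of xi_1 after eliminating pi_1 from
    xi_1^(r-r') = pi_1^(s-s') and xi_1^(r'-r'') = pi_1^(s'-s'')."""
    return (r - r1) * (s1 - s2) - (r1 - r2) * (s - s1)
```

Note also that $x = t^{r-1}$ lies in the group generated by $t^{p-1}$ because $p - 1$ divides $r - 1$ for every power $r$ of $p$.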